Talk:Verified Publishing Names

Much of this discussion is refactored from User talk:Marc Kupper#Verified publishers. -DES Talk 23:35, 15 September 2008 (UTC)

Initial Discussion

On the Publisher:Verified publishers page, you say:

To help support this project an easy thing ISFDB editors can do when entering and verifying publications is to make sure that the publication record's publisher name field accurately reflects what's stated on the title page. For example, rather than just "Ace" you would use the full imprint or publisher name as stated on the title page such as "Ace Books", "Ace Books, Inc.", or "Ace Science Fiction Books".

I am not in the least convinced that this is desirable. I think it merely leads to further fragmentation in the publisher names, and impedes the movement toward regularization. Special questions of a few verifiers on a few targeted publications are one thing, but attempting to do this in general is quite another. For example, if a book says "Baen Books" I will pretty much invariably enter it as simply "Baen"; "Tor Books" I always enter as simply "Tor"; and I think this is a good thing, a positive virtue. Those publishers are IMO already verified; we don't need more data to know their canonical names, we should simply be using the canonical names. The same applies, I think, to Ace, at least outside the "specials". Furthermore, I think that outside sources are FAR more useful than publication references in verifying a publisher's name or names over time, and should be emphasized at this stage.

I urge more discussion of what a "verified publisher" is and means, and how we will get and use the verification data, before proceeding with this project. -DES Talk 03:51, 14 September 2008 (UTC)

I agree but at the moment there's no way to regularize publisher names within the DB, no support software, and there are no rules regarding publisher names. Marc Kupper (talk) 04:36, 14 September 2008 (UTC)

It's not my intent to create any new rules regarding publication entry. As noted earlier, the entire thing is an experiment. I'm going to try it for a while and see if it implodes the way the existing ISFDB publisher names, and to a lesser extent the publisher namespace, already have. I agree with you that the publisher/imprint names are fragmented and perhaps all we'll find out from the project is verification that the names are fragmented to the point that efforts to document them need another approach. Marc Kupper (talk) 04:36, 14 September 2008 (UTC)
There's a big assumption in there too, that the title page HAS the information we want. Just picking up a few of my "to do" pile suggests otherwise: e.g.
  1. "Scholastic" is all that's on the title page, "Scholastic Children's books, an imprint of Scholastic Ltd" is on the copyright page. It goes on to state the book is published by "Scholastic UK Ltd" and that "SCHOLASTIC is a trademark of Scholastic Inc". We'd record a trademark and lose the stated imprint?
  2. "Ace Books, Inc" on title page, "An Ace book" on copyright page, "Ace Book" on spine, whereas "ace first in gothics" on cover suggests (to me at least) that separating Ace Gothics from Ace Science Fiction might be something useful.
  3. "Ace Books, New York" on title page, references to "Ace" and "Ace Books" on copyright page, but "Ace Science Fiction" on spine looks more useful.
  4. Just a logo on title page and spine, from which you can pick out a single letter "G". Copyright page tells you it's Gollancz, from the time it was an imprint of Orion.
  5. "Fontana/Collins" on title page (one of the very few times I've seen a '/' truly used in a publisher name), "Fontana Science Fiction" on front cover.
I think you have to search the book a bit harder to discern imprint, publisher and publishing groups: and I think capturing all those in the wiki might help (despite the search problems we'll still have with Ace, Tor, NEL, NAL, etc) but what you're suggesting for the publisher field goes contrary to what's actually been happening for months now. Can we keep this experiment to the wiki for the moment please? BLongley 11:55, 14 September 2008 (UTC)
What's wrong with additional 'fragmentation'? This is an experiment: if it's successful it could result in some fragmentation, and if it is discontinued, it seems like it would be a small effort to 'restandardize' the entries in the experiment. Kevin 04:40, 16 September 2008 (UTC)
Yes, it would be very easy to "restandardize" a Publisher in the database given the tools Al has given us: any Mod can override ALL verified publications with the "Wrong" imprint or publisher on. For instance, I could merge Corgi and Bantam on the grounds that Corgi WOULD have been called Bantam if the UK trademark for Bantam hadn't been unavailable in the 1950s. It's because the Mods so far have NOT used these tools to massacre our data that it's taken literally months to reduce from 9,424 publishers to our current 8,300 or so - we're being careful NOT to trample on things people generally support enough to verify. If the Wiki gets lots of extra pages that the database doesn't eventually link to, no problem: the database is still sound, and the Wiki has pages that might state why it's no longer important. If we have to "restandardise" the database then we can pull out the BFG tools and "just do it" or we can go through all the hard work of checking if a Publisher entry is supported by a verifier not in this project that ought to be consulted. As I said, it's taken six months to do a small bit of regularisation, and if I (or the other "regularisers") had to go through that again we might be tempted to do it without checking that every pub, in every year, has NOT got someone that ought to be consulted first. BLongley 22:18, 16 September 2008 (UTC)
My point was that any 'entries' in this experiment will be well documented, and it would not be a major effort to standardize just those entries that added to the fragmentation if the experiment were closed at a later date. That was all. Kevin 22:52, 16 September 2008 (UTC)
There's no problem with fragmentation in the Wiki: any page in the Wiki can be left alone happily (IMO), even if the last update for "London: Panther", for instance, adds "You're using a lazy librarian's abbreviation, Panther were actually based in St. Albans. Use Publisher: Panther". Or if it ends with a REDIRECT page. Actually, I'm tempted to add similar Wiki pages to stop such misuse of Library Catalogue data as being canonical. I'm against undesired regularisation in the database: I'm probably against such in the Wiki too. What do you think should/would be standardised if the experiment ends? BLongley 00:03, 17 September 2008 (UTC)
I added a new publication article, SHLDTFBFWF1978, which documented exactly what's stated in a publication. For example, I'd been in the habit of using "Ltd." with the period but discovered that usage, at least in GB in 1978 by that publisher, is "Ltd" without a period and so that version got used on the page. I then wrapped the names found with {{vn}} tags. This gave me a page full of red links and one blue link for Tandem.
I then went to each of the pages and created stub articles. To simplify the process I added a new template, {{refp}}, which also allowed me to add a data point to the existing Tandem article without being too obtrusive. Overall, the process seems easy though we do have 'fragmentation' in that many new names were documented with the value being that they are the actual names used by the publisher and we know the relationships between the names at that point in time. I did not use the project header banner other than on the source publication nor did I template the article wording though I tried to keep the wording consistent.
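The tagging step described above could be scripted roughly as follows. This is only a sketch, not the tooling actually used for the project: the function name `wrap_verified_names` is hypothetical, and the only thing taken from this discussion is the `{{vn|Name}}` link syntax. Longer names are matched first so that "Tandem" inside "Universal-Tandem Publishing Co. Ltd" is not tagged separately.

```python
import re


def wrap_verified_names(text: str, names: list[str]) -> str:
    """Wrap each listed name found in text with a {{vn|...}} tag.

    Hypothetical helper: it assumes names are matched literally and that
    the text has not already been tagged (running it twice would nest tags).
    """
    # Try longer names first so a short name inside a longer one is not split out.
    ordered = sorted(set(names), key=len, reverse=True)
    if not ordered:
        return text
    pattern = re.compile("|".join(re.escape(name) for name in ordered))
    return pattern.sub(lambda match: "{{vn|" + match.group(0) + "}}", text)


sample = "Published by Universal-Tandem Publishing Co. Ltd under the Tandem imprint"
print(wrap_verified_names(sample, ["Tandem", "Universal-Tandem Publishing Co. Ltd"]))
```

Names that come out as red links would still need stub articles created by hand, as described above.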
At some point I see people creating canonical articles, such as Corgi, which will serve to consolidate the fragment/stub articles. There are a couple of ways a canonical article can consolidate stubs. One is to just reference the fragment/stub articles in the body of the canonical article. Another method that's more work is to wiki-redirect the fragment/stub articles to sections in the canonical article. People using the redirect approach will need to be aware that the entire redirect article needs to be in one line. For example, to change Universal-Tandem Publishing Co. Ltd into a redirect to a canonical Tandem article you would need to have
#REDIRECT [[Publisher:Tandem#Universal-Tandem Publishing Co.]] {{Verified Publisher}}
or perhaps
#REDIRECT [[Publisher:Tandem#{{PAGENAME}}]] {{Verified Publisher}}
and within the canonical Tandem article you'd have a section about Universal-Tandem Publishing, its imprints, etc. Marc Kupper (talk) 20:23, 16 September 2008 (UTC)
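For anyone generating such redirect pages in bulk, a minimal sketch of building the one-line wikitext might look like this. The helper name `redirect_line` and its parameters are hypothetical. Only the `#REDIRECT [[Publisher:...#...]]` form followed by a category template comes from the comment above, and the Corgi / Corgi Yearling pairing is just an illustration.

```python
def redirect_line(canonical: str, section: str, template: str = "Verified Publisher") -> str:
    """Return a single-line redirect to a section of a canonical publisher page.

    Hypothetical helper: the page and section names are whatever the caller
    supplies; nothing here checks that the target page or section exists.
    """
    # The whole redirect page has to stay on one line, so reject embedded newlines.
    for part in (canonical, section, template):
        if "\n" in part:
            raise ValueError("a redirect page must stay on a single line")
    # Doubled braces in the f-string emit the literal {{ and }} of the template call.
    return f"#REDIRECT [[Publisher:{canonical}#{section}]] {{{{{template}}}}}"


print(redirect_line("Corgi", "Corgi Yearling", "Verified Imprint"))
# prints: #REDIRECT [[Publisher:Corgi#Corgi Yearling]] {{Verified Imprint}}
```

Passing "{{PAGENAME}}" as the section gives the second form shown above, which leaves the section name to be filled in by the wiki.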

Copyright issues

At present I'm unaware of any reliable outside sources other than one publication whose name escapes me at the moment but where the author/editor spent a few years documenting physical publications, writing letters to publishers, and phoning them. The results are copyright and could not be used in ISFDB. This project has similar goals to that book but will be available under creative commons. Marc Kupper (talk) 04:36, 14 September 2008 (UTC)

The factual contents of a reference work are not and cannot be protected by copyright. We could not copy, in general, specific text from such a work. But if such a work states that publisher "James & Co." became "James & Johns" in 1945, "James, Johns, & Smith" in 1967, and part of "JJ Communications inc" in 1995, we can report those facts and cite that publication as a source, and we should do so. The Feist v. Rural decision very specifically said that someone who published a directory could not prevent others from copying information from that directory and using or indeed republishing it, even in a competing directory. Facts are not copyrightable under US law. -DES Talk 15:16, 14 September 2008 (UTC)
I certainly intend to enter what I find from the "Dictionary of Trade Name Origins", where it covers an imprint or publisher. It certainly clarifies why Corgi and Transworld exist, for instance. BLongley 15:33, 14 September 2008 (UTC)
Extracting and/or rearranging facts is fine but there's a slippery slope into copyvio. It's less of an issue when the source is data in tabular form. If the data is presented in English sentences then the work is copyright though obviously the facts can still be extracted. "Joe was born in 1951" is copyright but it's ok to document that Joe's birth falls in 1951. Obviously for copying "Joe was born in 1951" literally you would be looking at the portion of the work and the more subjective "is this an original expression?" Anyway, the slope is there and FWIW, I'm fine with existing projects such as Don Erikson's adding/updating publication records based on a source reference as he's only filling in database metadata and citing the source for that data. Marc Kupper (talk) 19:01, 14 September 2008 (UTC)
Note also that "obvious and natural" expressions of facts are generally not protected either: "Joe was born in 1951" would, if factual, not be protected, as it is a very obvious and natural way of expressing that fact. The more creative and individual the expression, the more it is likely to be protected. But for straight facts, any significant paraphrasing is enough to avoid any copyright issue, and surely extracting facts from a set of sentences and putting them into tabular form or bulleted list form would be perfectly ok. Above you implied that the contents of reference works could not be used in this effort: "The results are copyright and could not be used in ISFDB", you said. That is incorrect: the results of any such project can indeed be used, but the words expressing those results may need to be changed. -DES Talk 15:55, 15 September 2008 (UTC)

Project Page location & name

By the way, I would suggest moving Publisher:Verified publishers to ISFDB:Verified publishers project or just ISFDB:Verified publishers, as that is the proper namespace for a project page.-DES Talk 15:18, 14 September 2008 (UTC)

I thought about that when creating the project but the existing projects all seemed to be data consistency cleanup related and this is about documenting the names in the wiki and not a database cleanup. If there is a "Publisher Names Cleanup" project then it could be patterned after the existing Project:Author Names Cleanup. In any case, I don't have strong feelings for where the project's home page should go and if it should be in the root or ISFDB namespace for example. Technically it's not an ISFDB database project meaning I'd lean towards the root namespace.
Before someone moves the article - is "Verified publishers" an agreeable project name? I could not think of a single word that would apply to publishing groups, publishers, and imprints. Calling it "Verified publishing group, publisher, and imprint names" seems like a mouthful. While the present output of the project is three categories it's possible more could get added.
Something I thought of this morning when falling asleep is to not bother with setting up verified pages but rather all we care about are verified names meaning the existing ref/note/etc. templates would be used on the standard publisher pages to flag "this is an actual name/address spotted in a publication." The big advantage of pages is that categories can be constructed from them. I don't think the Wiki offers a mechanism where on a page you can say "Include the name 'xyzzy' to the Verified Imprint category/list." Marc Kupper (talk) 18:40, 14 September 2008 (UTC)

(unindent) Based on the feedback so far I did an edit to Publisher:Verified publishers to

  • Remove the advice concerning what to put in the ISFDB publisher field.
  • Reword things to make it clearer that it's to document the names in a verifiable fashion on the wiki.

I'd like feedback on this. I need to take a break but I'll also tone down the wording about "don't make additions unless they are sourced" though ideally people would get in the habit of doing that automatically. As DES noted, very few use the <ref> tag and on Wikipedia one constant battle for their moderators is encouraging people to use and cite reliable sources (I suppose I should cite that. <g>). Marc Kupper (talk) 18:40, 14 September 2008 (UTC)

"Verified Publishing Names" seems general enough for the project to cover groups, companies, subsidiaries, divisions, imprints, sub-imprints, etc. You could even start deriving "printers" from some stuff I've added. Maybe even publisher series, which are a bit lost at the moment - e.g. the "Corgi SF Collector's Library" fits fairly nicely on the Corgi page, but "Venture SF" moved from Hamlyn to Arrow. You seem to have a goal in mind already, which lacks some of those levels. We can probably bodge a few of them together to fit, but some idea of how to link up or down (or even when we SHOULD link - should we link verified to unverified or vice versa?) would be good. It's the lack of links I find particularly missing. It's obvious there are hierarchies involved, which currently cannot be represented in the database, but should be eventually: demonstrating those in the wiki would give a good guideline as to what people want. BLongley 21:00, 14 September 2008 (UTC)
Verified Publishing Names is a good name and will allow for adding subsidiaries, divisions, publisher series, etc. I've moved the page. The links between articles will be added as the names get developed. I don't think you will find a hierarchy except at specific instants in time.
I suspect what will happen is that some of the smaller articles will end up as redirects to sections of larger or canonical articles. For example the Yearling imprint may end up as a section on Bantam Doubleday Dell Publishing Group, Inc. and the Yearling page will be a redirect that also includes {{Verified Imprint}} so that it gets included in the Imprints category. People would still reference Publisher:Yearling and the redirect will take them to the section about Yearling.
As for verified vs. unverified. Here's a thought - Right now we are linking to names directly, such as Publisher:Yearling. If a name is verified we can change these links to instead use {{verified name|Yearling}} (or {{vn|Yearling}}) and that will color code the links much like the existing {{a}} and {{p}} link-templates do. We may even be able to add hover-text so that when the mouse hits it there's a "Verified Name" hover box.
I will be toning down the "do not edit this stuff" wording in the headers and may even eliminate the headers entirely though at present they serve to help me remember where the home page and categories are...

Project goals & entity levels

As for "You seem to have a goal in mind already" - yes - the goal is documenting when and where names were used as I've often run into a name and wondered where and when that name was used. ISFDB's publication field often contains too much noise though it's been helpful to at least point me in a direction. Marc Kupper (talk) 22:23, 14 September 2008 (UTC)

Yes, but you've only defined three types of name you want: I can already see up to eight possibilities. By defining just three it makes it look already decided that that's all we need. And sure, a sub-imprint like Publisher:Corgi Yearling can mostly be treated the same way as the parent imprint Publisher:Corgi, and "Divisions" and "Groups" might be equally treated. "Printers" might be discardable but as they are sometimes the only distinguishing text characteristic between two books they might occasionally be worthwhile checking. And some Publishers like Collins were also Printers for other publishers. BLongley 18:58, 15 September 2008 (UTC)
By the way, I'm glad to see that Printers are now included in the experiment. I'm not convinced they are important enough to be in the database, but there's no harm in having them in the Wiki. BLongley 22:27, 16 September 2008 (UTC)
One of my major concerns is that you're often going to clash the working publisher pages with the verified publisher pages. (And similar for imprints.) I definitely don't want the pages for useful notes on future database entries suddenly becoming rule-bound pages that people are afraid to edit because it's too much trouble - we already have people too afraid of the ISFDB interface to use it, make that the case for the Wiki too and we'll lose another set of possible editors, and the data they would have provided. Maybe some guidelines on how to split a page into unverified and verified bits would help? BLongley 21:00, 14 September 2008 (UTC)
I agree but also don't want to "encourage" that people just add stuff and so I'm trying to raise awareness and use of references/citations. Right now when people add what looks like unsupported material to ISFDB the moderators hold it up and ask "what does the publication really say?" I'd like to see the same mindset applied to wiki articles. Unfortunately, it's a lot of work. It's more fun to just write down what you know and spot via a casual scan of the Internet rather than plowing through 30 year old copies of Locus. My desk is a disaster of stuff to finish before the end of the day meaning I don't have a lot of time to edit the wording at the moment. Marc Kupper (talk) 22:23, 14 September 2008 (UTC)
I've worked through a few publishers now, which I do recommend people trying, but I doubt I'll do a lot of it for the stated project reasons. I'd like you to look at what I have entered under the new guidelines: I have identified some "publishers" and "publishing groups" and confirmed some imprints along the way. But where I've found "Verification Sources" to be useful, it hasn't been for ANY of those: it's been for addresses, logos, and artist signatures. I'd particularly like to know if what I've entered for Corgi, Carousel, Hamlyn, Bantam etc is data that would lead you to a DIFFERENT conclusion from mine as to what constitutes an imprint, or a publisher, or a publishing group. I think for a start we definitely need some regularization rules: why are your first examples "Inc." for instance, rather than "Inc" or "Incorporated"? Should mine be "Hamlyn Publishing Group Ltd." with a full-stop that wasn't present in the source? Or "Limited" to give it the official name? BLongley 21:00, 14 September 2008 (UTC)
Overall though, I still think the basic expressed desires of what data we WANT recorded is still too loose. I've messed up the Corgi page with all the logo data I found - it would be GREAT if Corgi was an imprint where publication dates were often missing, and a logo change would put a date-range to such a publication. But actually Corgi books are pretty good at recording that stuff, and such might be better used for Macfadden or Lancer, for instance. BLongley 21:00, 14 September 2008 (UTC)
I tried to use the names/addresses exactly as stated. Of course, I spot a copyright page that says "New York, NY" at the top and "New York, New York" at the bottom for the same street address but at least the top was the address of the imprint and the bottom was the publisher group. I'll need to think about what you added for Publisher:Hamlyn Publishing Group Ltd. There certainly do seem to be a bunch of names and I'm wondering if an "undefined" category needs to be added. For example STRHNTCKCK1985 mentions "First published in Great Britain 1985 by Hamlyn Paperbacks." What is "Hamlyn Paperbacks"? Obviously we can add it to the name list but where?
From Duncton Wood evidence, it appears to be a "Division", which is why I suggested you might not have defined all the levels we will require. Doubleday became a "Division" too - see Carpe Jugulum, but in that case "Doubleday" is also the imprint, whereas I would say "Hamlyn Paperbacks" printed books under the "Hamlyn" imprint. BLongley 18:58, 15 September 2008 (UTC)
It's possible we will change the page titles to use normalized names such as "Hamlyn Publishing Group" and under that verify sources for various suffixes rather than cluttering a category with variant names. I need to run. Marc Kupper (talk) 22:37, 14 September 2008 (UTC)
I'm done with ISFDB for the weekend. I started with Publication:STRHNTCKCK1985, tagged names with {{vn}}, and then started clicking on red links, copy/pasting in pages. I'll clean this up some more when I get a chance, as it looks like adding a stub publisher, etc. page could be reduced to a copy/paste, with the tricky part being to remember to change the publication reference. Marc Kupper (talk) 03:36, 15 September 2008 (UTC)

(unindent) Different corporate families will use different numbers of levels, and use different names for them. The levels and their names will also change at different times within the same corporate entity or related group of entities. I suggest that we use only three levels.

Imprint
A consistent name or line used in the marketing of books which is clearly part of a larger business entity. May or may not have its own editing staff and policies. Generally does not act as a separate business entity, and does not have separate legal existence.
Publisher
A separate business entity. The publisher's name is usually but not always used in some form on the publications, possibly in addition to an imprint name. Normally has its own editing staff and policies, and its own acquisition policies and budget. May control multiple imprints. In some cases, an entity that was formerly a publisher may become an imprint, often as part of a business merger or acquisition.
Publishing group
A business entity that owns or controls one or more publishers or publishing groups, but does not itself act as a publisher.

The organizational structure of publishing is different enough in different firms, and has changed enough at different times, that no absolute rules can be made to separate these levels: assigning any particular entity to one of them is going to be something of a judgment call. But then, our goal is not to record all the twists and turns of corporate organization, but merely to record what the names printed on books mean; what you can tell about a book from those names, including using such names as clues to dating; what groups of books were produced by sufficiently similar editorial teams to be meaningfully compared; and to be able to match up physical publications, ISFDB records, and other bibliographic sources, many of which use some concept of "publisher" as one of their key fields. I think that the above three levels will suffice for those goals, and that trying to document not only every level in every corporate structure, but every name used for every level, will be counterproductive. What one corporate cluster calls a "Division" another may call a "group" or a "line". -DES Talk 20:11, 15 September 2008 (UTC)

I can live with three levels - maybe even two. I'm personally interested in imprints, which often tell me what to expect from a book: so I'd personally like Ace divided into SF, Fantasy, Horror, Gothic and Other, for instance. Doubles are a separate problem but I'm not sure how often a Double mixes those categories. If the consensus is to combine all Ace Speculative Fiction under one imprint I can live with that, but I'd prefer we separate the imprints (whatever they are) and have a vague level of "Ace" that combines them all for people less fussy than me. If that's a "Publisher" that's fine by me too, I don't really care: if we haven't separated the Imprints then I'll search for the Publisher. BLongley 23:14, 15 September 2008 (UTC)
Publishing GROUPS are very vague but might help me find the SF imprints used in various other countries: e.g. if an American was looking for the equivalent of "Bantam" 1960s/1970s books in the UK, he should be directed toward "Corgi". "Bantam" was also OK in the UK in the 1980s though. I don't actually see any harm in humouring Marc's experiment unless it starts interfering with our lazier, "unverified" Publishers - at worst any "publishing name" used will be a wiki page that nobody links to. And a verified source for something doesn't look like a bad idea - but as stated elsewhere, what I'd use it to verify is not necessarily what was planned. For instance, I found it useful for Cover Signature/Artist Credit cross-references - I can point people at a publication that confirms such. BLongley 23:14, 15 September 2008 (UTC)
I can't say I like your proposed definitions beyond "Imprint: A consistent name or line used in the marketing of books" and "an entity that was formerly a publisher may become an imprint". We COULD divert into registered Trademarks and official Company registrations, but I'm reluctant to do so right now - let's figure out what we each want from this, EXPLAIN IT, and maybe we'll start moving towards an agreement. BLongley 23:14, 15 September 2008 (UTC)
As the exact legal definitions, relationships, and number of employees dedicated to a particular brand are often not documented well in publications I'm not sure that we can use the definitions listed above to "define" the names though they are excellent descriptions of the distinction between imprint, publisher, and group. I suspect "Publishing division" will get added to the list of categories as the words "a division of" show up often in publications. For example, DAW Books is identified as a division of Penguin Group (USA) on their web site[1] and for years publications have said "DAW Books is distributed by Penguin Group (USA)". I'm not sure I want to add a category for "distributor" though that's exactly what's stated in the publications. The word does give a clue as to the relationship between DAW and Penguin where DAW gets to operate autonomously and Penguin provides the money/muscle/marketing/distribution.
Sometimes we just need to guess when deciding what something is and that's the value of documenting the source information. For example, I took a guess that Tandem is a Universal-Tandem imprint based on a statement in SHLDTFBFWF1978. Anyone else can see the source statement and could either interpret it differently or would know from other sources that Tandem was actually a publisher or division at the time and add corrections or their point of view. Marc Kupper (talk) 21:30, 16 September 2008 (UTC)
I would use "imprint" for anything that does not seem to operate as an effectively separate publisher; I would use "Publisher" for anything that does seem to so operate, whether it is owned by another firm or not; I would use "Group" for any firm that owns one or more publishers or groups, but does not seem to directly act as a publisher. I would not use "division" at all -- I strongly object to this use, as it ties us too closely to the exact corporate structure which a) we can't document well, and b) we have no reason to care about anyway. I would call anything above the level of "publisher" a group, and I would call any entity that actually publishes books a "publisher" whether it is owned by another firm or not. Let's not make this any more complex than it needs to be. My "definitions" above were intended to be more descriptions than anything else. If you reserve "publisher" for firms not separately owned and use "division" for ones owned by a "group" there may well be NO currently active publishers except for small presses: all major publishers are, I think, part of larger corporate entities in some way. Baen is probably the largest SF publisher that is currently "independent" in any reasonable sense. -DES Talk 21:39, 16 September 2008 (UTC)
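Read as a decision rule, the comment above might be sketched as below. This is a loose illustration rather than an agreed definition: the two yes/no inputs are assumptions about what an editor would have to judge from the sources, and the names `Level` and `classify` are hypothetical.

```python
from enum import Enum


class Level(Enum):
    """The three levels proposed earlier in this section."""
    IMPRINT = "Imprint"
    PUBLISHER = "Publisher"
    GROUP = "Publishing group"


def classify(acts_as_publisher: bool, owns_publishers: bool) -> Level:
    """Assign a level from two judgment calls made by the editor.

    Ownership by another firm is deliberately not an input: anything that
    effectively publishes books counts as a publisher, owned or not.
    """
    if acts_as_publisher:
        return Level.PUBLISHER
    if owns_publishers:
        # Owns publishers or groups but does not itself publish.
        return Level.GROUP
    return Level.IMPRINT


print(classify(acts_as_publisher=True, owns_publishers=False).value)
```

The judgment calls themselves still have to come from the documented source statements; the sketch only shows that "division" is not needed as a fourth outcome under this rule.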
I would probably, and very loosely, define "Imprint" as the name or phrase that can be inserted as the "fitb" between "A" or "An" (fitb) "Book" and still be clear as to what we're talking about. For instance, "An Ace Book", "An Arrow Book", "A Baen Book", "A Bantam Book", "A Corgi Book", "A Doubleday Book". This won't be perfect when names collide, so "A Magnum Book" probably needs a suffix on "Magnum" for all but the most used - and maybe even on that. "A Gollancz book" might be a bit vague if the book actually says "VGSF", and "A Daw Book" or "A DAW Book" is a regularisation issue. "A Tandem Book" matches my idea of an imprint (spine/cover logo etc) whereas "A Universal Book" doesn't. I find I'm becoming even less interested in "Publishers" or "Publishing Groups" or "Divisions" now. I'm happy to record such as part of this experiment, but it does seem that two people can look at the same data and draw three different conclusions. I'm more drawn to clarifying sub-imprints: e.g. Marc has independently identified "Tandem" but there may still be value in separating "Tandem Sci-Fi" and "Tandem Fantasy" and "Tandem Science Fantasy". BLongley 23:16, 16 September 2008 (UTC)
The only problem with that approach is that it would define many publishers as imprints. "A Harper & Row Book" works perfectly well. So does "A Baen Book", indeed many publications use that phrase, and IMO any system that classes "Baen" as an imprint rather than as a publisher is broken. For the matter of that, "Doubleday" may be an imprint now, but it was pretty clearly a publisher in, say the 1970s, so "A Doubleday Book" may indicate either a publisher or an imprint. I suppose you could say that such publishers were their own imprints, but they were also and primarily publishers. -DES Talk 16:31, 17 September 2008 (UTC)
I have no problem with a name being both - many of my favourite imprints have started as independent publishers and been absorbed by others. It's the changes that may be important to people - e.g. a takeover often means a different editorial team. For ISFDB purposes, knowing Imprint and Publisher relationships can be essential to understand printing numbers, which in the UK at least can carry over imprints and companies. What I don't want is separate records for say "Doubleday" the publisher and "Doubleday" the imprint: or even "Doubleday & Company, Inc." and "Doubleday & Co." unless there's some value in the separation. (Dating publications for instance). BLongley 19:17, 17 September 2008 (UTC)
And some imprints don't work well in this phrase: "An Ace Special Book" sounds rather odd to me, as does "A Ballantine Adult Fantasy Series Book". Having a consistent Logo/spine text/cover format and other marketing features marks an imprint, but may also mark a publisher not using an imprint (or with only a single self-named imprint, if you prefer to think of it that way). I am also less interested in publishing groups or other levels above publisher, but if people want to record these, i think that a single level name ("Publishing group" will do as well as any, IMO) should be used for all such entities. -DES Talk 16:31, 17 September 2008 (UTC)
For "Ace Special" I'd say "Ace" was still the imprint and "Ace Special" is a sub-imprint. I'd never say "A Ballantine Adult Fantasy Series Book", true, but nobody seems to want to include "Series" in the name so it's just "A Ballantine Adult Fantasy Book": a bit stilted yes, but not totally unreasonable. When the ending of a suspected imprint or sub-imprint is a word that is a reasonable replacement for "Book" but doesn't actually mean "Book" just try it without the "Book" - e.g. "An Ace Special", "A Ballantine Adult Fantasy". When the word is almost a synonym for "Book", drop it: e.g. "An Ace Novel Book" is ridiculous, but "An Ace Novel" or "An Ace Book" passes the test and "Ace" is a valid imprint. I said I would define it very loosely! ;-) BLongley 19:17, 17 September 2008 (UTC)
IIRC "Series" was part of the formal name of the Ballantine Adult Fantasy books, and was included on their copyright pages. -DES Talk 21:04, 17 September 2008 (UTC)
I think I own some, but as the DB says I haven't verified ANY they would be difficult to find. It looks a useful separation to some people though, see the references on the wiki-page. I think I added the "Do not merge" warning for such a reason but it wouldn't be a problem for me personally if the "Series" was added to the name, and maybe even if it was merged with "Ballantine" (or "Del Rey"?) it wouldn't be a problem if the "Publisher Series" was set up right. I'm definitely not going to do such though, I lack the data to make such decisions and the "Ballantine Del Rey" versus "Del Rey" should probably be sorted before any sub-imprints. BLongley 23:08, 17 September 2008 (UTC)
That said, your somewhat expanded test works reasonably well as a filter for what might be an imprint, provided that we remember that passing it does not rule out something being a publisher. Whether we call things like "Baen" or "Harper & Row" publishers, or both imprints and publishers is a matter of terminology, and i have no problem if we call them both. I'm not sure that I see much value in the notion of "sub-imprint", such a thing seems to me merely a different but related imprint -- the relations may not be strictly hierarchical. -DES Talk 21:04, 17 September 2008 (UTC)
I definitely see a value as I buy Science Fiction mainly, very little Fantasy, and almost no Horror. And I do occasionally aim for all the (affordable) SF by Imprint/Publisher and if they distinguish such in any way it helps me. We have "Hamlyn" here, a lot of them verified by me. I noticed during this experiment that almost all the Hamlyn books I own have "HAMLYN SCIENCE FICTION" on the front cover - if it turns out that there's "HAMLYN FANTASY" or "HAMLYN HORROR" too I'd like those separated and don't mind the current "Hamlyn" being divided. Similar with children's imprints: a lot of adults won't buy a kid's book, and would want a warning that a sub-imprint could provide - e.g. "Yearling" seems to be a good warning, although the current verified name doesn't make it clear that Corgi also used such a name. I think by definition a SUB-imprint has to be hierarchical, but the hierarchy can and sometimes does change over time - for instance, Methuen's Children's Books division was sold to Egmont, and presumably all its imprints and sub-imprints went with it. (What they were, I can't really say - I'm not averse to buying "children's" books even now, but they're a small proportion of my library.) BLongley 23:08, 17 September 2008 (UTC)
There may perhaps be value in distinguishing between "HAMLYN SCIENCE FICTION", "HAMLYN FANTASY" and "HAMLYN HORROR" (assuming that they all existed), but if so, I would simply call those "imprints" and not try to distinguish what is a "full imprint" from what is a "sub imprint". Personally, however, I wouldn't trust an imprint name to usefully distinguish for me between science fiction and fantasy in making purchasing choices; I would want a synopsis, review (even an Amazon review would be enough for this purpose), or other guide. In some cases the author's name will do it, in others it won't. -DES Talk 23:29, 17 September 2008 (UTC)
In ISFDB terms, there's often (but not always) little value in separating imprints and sub-imprints and creating a hierarchy. There's also value in separating some BIG conglomerations into smaller units that can be checked. People will always have their own ideas about what they'd buy anyway. I find some of the distinctions useful - do I want to search for "Scholastic" and see if they're all worth buying? No. Would I search for "Point"? No, but it would get me a shorter list to check. Would I search for "Point SF"? Maybe. There's a level of information that people might find useful, and this might vary by publisher or imprint: so while I might find Ace usefully separated into SF/Fantasy/Horror/Gothic as they published a lot, Hamlyn paperbacks that we have HERE are so far mostly SF in my experience - well, we wouldn't have the non-genre Hamlyns anyway - we're a bit self-selective. Whether what Hamlyn calls SF matches my idea of SF is still to be determined - I'm not buying all Hamlyn on sight, but am buying Corgi SF and Panther SF when I see it at a car-boot sale, even if it's a duplicate of something I already own. (I know - "overpaid IT consultant, money to burn" - not true, but I can spare a few coins for research material.) I don't want to be prescriptive about what level we should record data - if I were forced to, then I'd point at the "100 entries" limit search gives us. I really don't want such separated down to "these books say 'Corgi SF' and these say 'Corgi Science Fiction' and these just say 'Corgi'" though. BLongley 00:01, 20 September 2008 (UTC)
I would be inclined to focus on such things as having a distinctive logo, a distinctive cover layout or design, perhaps a distinctive spine design, etc, along with the name, rather than on just the name, and whether it ends in a term that can be used as an adjective (and so be followed by "book") or one that can be treated as a noun (and so replace "book"). But that is a difference of emphasis, and not, I suspect, in most cases, a difference of result, or much of a difference of intention. In short, I think we are in something close to agreement about what an "imprint" is. I suspect we are in reasonable agreement about "publisher" also. Like you, I care less about levels "above" publisher. I simply think that multiplying such levels, and trying to carefully arrange them in a neat tree structure, is probably foredoomed, and offers at best limited benefit to the ISFDB. Thus I would decide in advance to stick to only one such level, unless good reason for changing that was brought forward. -DES Talk 21:04, 17 September 2008 (UTC)
I think there ARE tree structures, but they won't be "neat", as they will vary by date and nomenclature. But there's no reason to "multiply levels" here and having ONE node for "Doubleday" or "Transworld" or any other significant name is fine by me, all variations can be recorded on ONE page and the transitions from "publishing company that was its own imprint" to "publishing company that had several imprints" to "subsidiary company of someone else that still had some imprints" to "division of someone else that decided to consolidate, retire or create imprints" to "this is a defunct imprint, it is no more, has ceased to be, has curled up its tootsies and shuffled off this mortal coil, etc" to "oh wait, people MISS that name, let's resurrect it" can be recorded there. BLongley 23:08, 17 September 2008 (UTC)
I suspect that in some cases some nodes have multiple parents, but that may be merely confusion of multiple eras. But even if such corporate relations are all in fact trees, trying to reconstruct the structure in full is, as you say, not very useful. Noting the various roles assigned to any given name is, I think, more helpful. A role you don't mention is "owned one or more publishers or imprints (directly or indirectly), but did not directly act as publisher or imprint". I think that "Charter Communications" and "Gulf+Western" fit this description. -DES Talk 23:29, 17 September 2008 (UTC)
I agree that you and I are mostly in agreement, and frankly probably have been for ages: we're just so vocal that any minor DISagreements, or requests for clarification, might make us look more argumentative than we actually are. :-/ BLongley 23:08, 17 September 2008 (UTC)
Quite probably true. Now if we can just persuade everyone else to accept our obvious good sense. ;) Thanks for discussing. -DES Talk 23:29, 17 September 2008 (UTC)
I'm always willing to discuss, but time and commitments interfere with prompt replies. Our lack of subservient minions dealing with the mundane is severely hampering our ISFDB-domination though. ;-) BLongley 00:01, 20 September 2008 (UTC)

Discussion location

By the way, perhaps some or all of this discussion should be moved or copied to Talk:Verified Publishing Names? -DES Talk 20:11, 15 September 2008 (UTC)

Probably, but we're not very good at finding discussions. :-( Copy for now, and Marc can kick this out of his talk page when we find somewhere when we can remind people that there ARE discussions going on outside the main Community Portal areas? BLongley 23:14, 15 September 2008 (UTC)

Organizing articles by date and geographic location

In the recent stub articles on publishers etc. I've been using [[]] around dates as in [[1978]]. I thought there was something wonderful that MediaWiki could do with this such as an automagic category for any article that mentions 1978. Checking what happens when I use [[1978]] on Wikipedia shows that such links go to manually maintained pages.

Do people think it would be of value to add a template, {{Active|1978}} for example, that drops pages into something like Category:Active 1978? We have categories that organize by publisher, imprint, etc. and I was looking to also organize by date. We could also organize by decade such as {{Active|1970s}}.

In a similar vein is location categories. {{Located|Great Britain}} or even {{Located|London}} which would transclude {{Located|Great Britain}} which transcludes {{Located|United Kingdom}}, etc. Marc Kupper (talk) 20:52, 16 September 2008 (UTC)

If MediaWiki isn't giving us anything automagic with dates, I'd drop the practice unless someone's willing to do the manual maintenance - I'm not. BLongley 21:32, 16 September 2008 (UTC)
As to location categories - given those examples I seriously recommend AVOID. A) Most of us don't even know what "transclude" means and B) any attempt to categorize GB within UK is doomed (it's not a subset, but most residents don't even know that) and C) when you start trying to work with Northern Ireland or the Republic of Ireland categories on the internet you will attract flame-wars up to and including personal death threats. BLongley 21:32, 16 September 2008 (UTC)
MediaWiki autoformats dates according to a user's preferences when, and only when, a full date is supplied, like 1 January 1991, or January 2, 1991 or 1991-01-03. Furthermore it doesn't seem to be turned on for this wiki anyway. It also causes red links on smaller wikis like this, and pointless blue links on Wikipedia. The Wikipedia manual of style now recommends not doing this in most cases. In any case, a year alone is never reformatted for date preferences in any way, and unless we create articles about particular years, has no value at all. -DES Talk 21:46, 16 September 2008 (UTC)
I think year by year categories for active dates would rapidly become a nightmare. I'm not sure that even decade categories would wind up being useful. Let me think about how this might be done automatically or semi automatically but also usefully.
I tend to agree with Bill about geographic categories: don't. -DES Talk 21:50, 16 September 2008 (UTC)
You might want to consider Category:Active 1970's and so on and so forth. That would limit it to 16 or so heavy hitter decades back to 1850 and then switch to 50 or 25 year blocks before that, or 21 (soon to be 22) decades if we stretched the decades back to 1800, and then moved to 25 year wide blocks. That would certainly be more maintainable than ~160 categories just to reach back to the beginning of Verne's publications in 1851. (Was a more salient comment but DES's comment went in while I was typing) Kevin 21:56, 16 September 2008 (UTC)
In the end, I don't think that would really turn out to be helpful either -- there will still be lots of manual work to extract anything like "Active from X to Y". I think there may be a better solution, let me think a bit. If you do go for decade categories, though, please make them "1970s", not "1970's". The apostrophe to pluralize numbers and abbreviations is simply wrong. -DES Talk 23:14, 16 September 2008 (UTC)
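
The decade idea discussed above can be sketched in a few lines. This is only an illustration, not anything that exists on the wiki: the function names are hypothetical, and the "Category:Active 1970s" naming simply follows the form suggested in this thread (without the apostrophe). It assumes a publisher's first and last active years are already known, which is exactly the manual work mentioned above.

```python
def decade_label(year: int) -> str:
    """Return a decade label such as '1970s' (no apostrophe)."""
    return f"{year // 10 * 10}s"


def active_decade_categories(first_year: int, last_year: int) -> list[str]:
    """List one 'Category:Active <decade>' name per decade touched by the range."""
    if first_year > last_year:
        raise ValueError("first_year must not be after last_year")
    start = first_year // 10 * 10  # round down to the start of the decade
    return [
        f"Category:Active {decade_label(decade)}"
        for decade in range(start, last_year + 1, 10)
    ]


if __name__ == "__main__":
    # A hypothetical publisher active from 1968 to 1983 touches three decades.
    print(active_decade_categories(1968, 1983))
```

Counting the result for 1850 to 2008 gives 16 decade categories, and for 1800 to 2008 gives 21, which matches the figures quoted above.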

Regularizing or normalizing publishing names

It is my view that making distinctions, even in wiki pages, on such differences as "Ltd." vs "Ltd" is perverse and pointless (no pun intended). Indeed I would be inclined to standardize right now on punctuation (all abbreviations get periods, would be my view) and on other such trivial differences, and routinely change any wiki or DB entry that differs from those standards. (I recall that we had earlier agreed that when a publisher name included initials ultimately derived from a person's name -- like "William R. Morrow" -- all such initials would get periods, and all such periods would be followed by spaces, just as we do for canonical author names.) Frankly i would hope that we could soon standardize on the total omission of such suffixes as "Ltd.", "inc.", and "PLC". The only value I see in them is that they sometimes give hints on nationality: "Ltd" is likely to be a UK or perhaps commonwealth firm, and I think "PLC" is mostly from Australia, while "inc" is most often from a US source. But once a given publisher is established as to its national origin or current operating HQ, these add no further value IMO. -DES Talk 16:19, 17 September 2008 (UTC)

I could agree on "all abbreviations get periods" except that that still splits the abbreviated from the unabbreviated. I did suggest regularisation of people's names used in Publisher names in the same way as we do for Author's names - I'm not sure we ever totally agreed on that but I saw no DISagreement so HAVE been doing that. I have dropped "Ltd" or "Ltd." or "Limited" suffixes from some verified names I've created, but there seems to be some support for distinguishing the same publisher/company/imprint in different countries so people have for instance CREATED "Bantam UK" and "Bantam Books (UK)" although no such name exists in the publications. (I'd think that the Currency Symbol on the price was enough, but it seems that's not enough for some people.) So we could remove a suffix like "GmbH" only to replace it with "(Germany)". (By the way, we have "PLC"s in the UK, and "Popular Publications, Inc.; Toronto, Canada" suggests that "Inc." won't distinguish US and Canadian editions, and at least one active Publisher Regulariser likes that distinction.) BLongley 19:41, 17 September 2008 (UTC)
Well, I did say that such abbreviations might help determine nationality; obviously they can't be the only or final word. But "inc" does pretty much rule out Europe, and "ltd" pretty much rules out the US (and maybe Canada?). They might also help for older books where we have no price data. The one real use I can see for such abbrevs is that they may mark a turning point in a firm's history: a change from "Jones & Sons" to "Jones & Co." may be meaningful. I don't see any real point in country suffixes unless they disambiguate otherwise similar publisher names. There is no reason for "Harper & Row (US)" or "Baen (US)", because there were/are no non-US publishers of those names. "HarperCollins" is a different matter. I am open to discussions about regularization rules (although not, I trust, endless discussions) but I think we should start applying as many such rules as we can agree on, both in the DB and the wiki, at once. For example no actual publication that I recall uses the form "Publisher/Imprint" -- that is purely a convention of ours (also used by other sources, of course). So we can make an arbitrary decision whether the publisher or the imprint comes first, and convert all existing records to match. Similarly, we can decide whether it is to be "Imprint/Publisher" or "Imprint / Publisher" (I favor the latter, but would accept either provided that we settle on one). Such decisions in no way reduce our data on who actually published a book, while they would help to reduce fragmentation in the db, which I continue to regard as a good thing. -DES Talk 21:31, 17 September 2008 (UTC)
What's an Ltd Pty in terms of nationality? :-)
From this experiment - it's Australia or New Zealand from my data so far. Not that I'm suggesting we derive nationality from such, rather the opposite. BLongley 00:44, 18 September 2008 (UTC)
I'm in favor of using the exact name as stated for verified names as it removes "thinking" from the process and also makes it easier to spot if something's in error should someone be comparing their source pub against what's on the wiki. The downside is that it will lead to fragmentation if a publisher is not consistent, though discovering that inconsistency may well be of value. Being anal about precision in ISFDB publication records has helped us discover "lost" names and/or why certain publications/stories were known under a variety of titles/authors/catalog #s, etc.
Please give an example of discovering a "lost" PUBLISHING name. BLongley 00:44, 18 September 2008 (UTC)
If people start normalizing and adding decorations like "(UK)" to the verified names it introduces uncertainty as to what the real name is. I know it's happened to me more than once where I've wondered what a particular publisher's name was in terms of deciding which version should be canonical. Is it "Doubleday & Co", "Doubleday & Co." or "Doubleday & Company"?
I am slightly concerned about adding suffixes to "Publishers" that aren't really present. I suggested months ago that Al should have given us a separate field on each publication record, so we could regularise one and have the other as an "as stated" field. BLongley 00:44, 18 September 2008 (UTC)
As this project has made me pay attention to wording I've already spotted a publisher inconsistency with someone using both "New York, NY" and "New York, New York" though in that case the first was the publisher/division address and the second was the publishing group's address. I learned that people in the UK seem to not use a period with "Ltd" and that "Penguin Group (USA)" is a "real" name and not simply someone's personal attempt to distinguish one Penguin from another.
Thus I'm leaning towards having any verified name be exactly as stated, including those pesky periods, with other names being either canonical, not yet verified, or noise. Marc Kupper (talk) 22:01, 17 September 2008 (UTC)
And I'm leaning towards abandoning this experiment if you're really THAT anal, or are making changes to suit your views. You have messed with pages that I submitted as verification sources. See the edit history for Publication:STRHNTCKCK1985 for instance. Yes, it's prettier. Is it what I submitted? No. Some of the verification sources I submitted are NOT actually centred on every page, they're occasionally left-aligned. My simple, messy, "add a space and get the boxes to appear" worked fine for me, and also more accurately represented what is shown in my pub. Can I rely on you to NOT decapitalise a phrase to get a "vn" link to work? I'm not sure now. I know I came to this project with some pretty firm views as to what I want, for the small area that actually interests me, but I've tried to record exactly what I actually HAVE. BLongley 00:44, 18 September 2008 (UTC)
I truly do not think that such tiny, trivial differences have any value for the ISFDB, and i do not plan to start recording them. Honestly, what practical value is there in noting that one address is printed as "New York, NY" and another as "New York, New York"? None. Indeed I would rather work on regularizing and merging existing variant names of what are clearly the same publisher. If you want to find out what a publisher's "real" name was, outside research, looking in corporate histories and old business directories (available at any good research library) are IMO far more likely to produce a definitive and accurate answer. But in many ways, I'm not convinced that that matters. Knowing when publishers merged is of value. But as between "Doubleday & Co", "Doubleday & Co." or "Doubleday & Company"? we should simply pick one, and it doesn't much matter which. The firm is the same in each case, and all three were actually used by the company or by reasonable secondary sources or both. I would incline to pick "Doubleday & Co.", or probably simply "Doubleday" if research does not indicate that the presence of the "& Co." usefully distinguishes the date of a publication. -DES Talk 22:23, 17 September 2008 (UTC)

(unindent) I'm going nuts trying to follow these multiple indented threads. The "reason" for the analality is that I've screwed up the publisher name field in my personal db, and I see the same has happened in the ISFDB publisher field, in that we filed things like Ace Books, Inc. (the publisher), Ace Books (the imprint of Charter), Ace Science Fiction Books (another imprint of Charter), and I believe there is Ace Publications, all under "Ace" or "Ace Books." An example of a "lost" publishing name is Ace Science Fiction Books where even though it's on dozens of my books and I'm sure among all of us we have hundreds or thousands of these pubs there were zero publications in ISFDB.

Many times I've looked at an online record, ISFDB and elsewhere, seen someone mention a publishing name, and thought "that does not look right." The verified names project is an attempt to construct a database of both the literal names publishers have stated and their relationships. The wiki is a nearly perfect tool, the main loss being that it's not easy to get it to do things like "Tell me about publishers active in London from 1960 to 1969." However, once the names project matures we should have a decent data collection that could be dropped into a database. When things are in a database there's usually a need to regularize and normalize names so that we can do those cross reference queries. Regularization and normalization don't gain you anything in a wiki. Another thing is that the wiki offers redirects, meaning that if an article is filed under an irregular name it's easy to point it at the regular or canonical name. There's no need to tell people "use the regular names" in a wiki.
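
To make the "publishers active in London from 1960 to 1969" question concrete, here is a minimal sketch of what such a cross reference query might look like once the data is in a database. Everything in it is an assumption for illustration: the table name, the columns, and the three placeholder rows are invented, and nothing here reflects the actual ISFDB schema.

```python
import sqlite3

# Hypothetical schema: one row per publishing name with a city and active years.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE publisher_activity ("
    " name TEXT, city TEXT, first_year INTEGER, last_year INTEGER)"
)
conn.executemany(
    "INSERT INTO publisher_activity VALUES (?, ?, ?, ?)",
    [
        # Placeholder rows, not real publisher data.
        ("Example Press", "London", 1955, 1972),
        ("Sample Books", "London", 1975, 1990),
        ("Placeholder House", "New York", 1960, 1969),
    ],
)


def active_in(conn, city, start, end):
    """Return names whose active years overlap the range start..end in a city."""
    rows = conn.execute(
        "SELECT name FROM publisher_activity"
        " WHERE city = ? AND first_year <= ? AND last_year >= ?"
        " ORDER BY name",
        (city, end, start),
    )
    return [name for (name,) in rows]


print(active_in(conn, "London", 1960, 1969))
```

The query only works if "London" is spelled the same way in every row, which is the point made above about needing regular names in a database but not in a wiki.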

I also want to remove "thinking" from publication related stuff. Entering and verifying exactly what's stated requires no "thinking." Sure, there is paying attention to detail, but I don't want to get involved with defining and discussing rules such as when we should or should not use a comma and which special words such as Inc., Ltd., Pty, Co., should or should not be included, spelled out in full, abbreviated, etc. A while back I wondered why I'd see HarperCollins on the Internet. Who in the real world uses camel-case and should that be normalized to something like Harper-Collins? Any time we start defining rules on how to format things we start listing all the exceptions and usually that's longer than the rules.

Who in the real world uses CamelCasing? HarperCollins, PlayStation, TiVo, FedEx, BlackBerry, MasterCard, iPod and more than a few more. In the "unreal" world: eBay, MySpace, YouTube, and SpongeBob. And don't forget MediaWiki. With the advent of the nonspacing URL, there's bound to be even more in the future, real or unreal. MHHutchins 21:53, 18 September 2008 (UTC)

I see normalizing, regularizing, and cleaning up names as entirely outside the scope of verifying what the actual names are and documenting their relationships.

The trailing Inc., etc. is important and should not be summarily dropped. For example, "Ace Books, Inc." was a publisher. "Ace Books" is an imprint. Someone "cleaning up the record" by merging the two into "Ace Books" is tossing the record of that publisher.

The business of the periods at the end is nit-picking but normalization involves defining rules, exceptions, and thinking and I'm trying to keep those as short and simple as possible. I am more concerned about normalizing things like "Faber and Faber" vs. "Faber & Faber." In ISFDB it seems there's a 50/50 mix. Does the publisher really use both names or did they change at some point? We'll never find out if we clean up the names.

Adding suffixes should be rare but the (parenthetical method) seems to be common and well understood. Publishers do it too as in Penguin Group (USA). In the wiki it is easy to make it clear when a "(USA)" was added by us to disambiguate vs. when we are documenting what a publisher states. It's sometimes confusing on the ISFDB side though using title notes can help.

Bill - I'm sorry about messing with your pub record - I thought you had asked for feedback on what you had done. One annoyance with the wiki is that if someone enters two adjacent lines, the wiki/HTML formatting tends to wrap them up into a single line. You were forcing line breaks with a leading space. Firefox displays blocks of space-indented text by wrapping them in screen width blue/gray boxes that have a dotted blue border. It was like trying to read what's on a zebra or tiger and it was also harder to spot the title vs. verso page stuff. What I did to the publication makes the text look much worse in wiki-edit mode but the displayed results seemed cleaner, though they may not have accurately followed centered vs. justified text as that was not clear from how you had entered it. Another way we could do it is to use <pre> but then we lose the ability to link from words in the text using [[]] or {{}}.

I found a way, though not apparent to someone editing the page, to get rid of the tiger stripes and so reverted back to your original text. Instead of blank lines between sections there's a single space on each "blank" line. That creates a single blue/grey box per page rather than stripes. I see you did a follow-up edit to center everything on the copyright page. I'll leave it up to you on which version of the page you want. Marc Kupper (talk) 09:19, 18 September 2008 (UTC)

I woke up this morning and the first thought was "I already am regularizing..." If I see "CORGI" I think/type "Corgi" and likewise, with the book in hand that says "CORGI SCIENCE FICTION" on the cover, "CORGI BOOKS" on the cover, etc. What's that blob inside the O?
It's a Corgi with a book in its mouth. See the publisher page for the history of the logos. (It's messy, I know - suggestions welcome.) BLongley 19:12, 18 September 2008 (UTC)
Thank you - that saves spinning up the magnifier lamp. While I took a shower I was thinking
  • About how some of the state government sites I looked at about Inc/LLC etc. use the "name" part as the distinguishing component and not the "Inc." suffix.
  • That states ask that punctuation not be used. For example, New York which is home for many publishers has "The state guidelines of New York require that you do not use any type of punctuation such as dashes, hyphens, periods, commas, underscores, quotations or exclamation points in the name selection."[2]. Other states I looked at had similar if not identical wording.
  • That some publications do not identify the nature of the company. For example, I have a book that says "Arbor House New York" on the title page and says it was published by "Arbor House Publishing Company". Without further research we don't know if "Arbor House Publishing Company" is incorporated, an LLC, sole proprietorship, etc.
  • That the verified publisher & group name categories will end up with a random mix of suffixes. Documenting the legal form of each company by including its suffix(es) does not add value to the category.
With those in mind I'm thinking verified names would just be the "name" part and that in the articles we'll have something like a "Full names and historical addresses" section that will document how the publisher has spelled out their name/address over time.
The regularization would be:
  • If a name is stated in ALL CAPITALS then convert it to title case unless it's known that the entity always uses all CAPITALS. (link to existing ISFDB rules on titles). Ideally
  • Do not include any suffixed "Inc." "Ltd" "LLC" "Pty" "Incorporation" "Corporation" "Limited" etc. in the name. These terms define the legal nature of the company and are separate from its name. The full name including any suffixed terms should be documented in the entity's article.
  • If a name includes any of the following abbreviated words then these should be expanded. The exact form of the name should be documented in the entity's article.
    • Co. -> Company
    • Bros -> Brothers
    • Note that I decided to not generalize this last item as there are names such as DAW that are abbreviations that should not be expanded. Marc Kupper (talk) 05:30, 19 September 2008 (UTC)
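
The proposed regularization rules above could be sketched roughly as follows. This is only a sketch under stated assumptions, not an agreed ISFDB rule set: the function and constant names are hypothetical, the suffix list is just the one quoted in the bullets, and the "always capitals" set contains only DAW because that is the one example given.

```python
import re

# Names assumed to be always written in capitals (DAW is the example given above).
ALWAYS_CAPS = {"DAW"}

# Legal-form suffixes from the proposal, matched only at the end of the name,
# with an optional preceding comma and optional trailing period.
SUFFIX_RE = re.compile(
    r"(?:,?\s+(?:Inc|Ltd|LLC|Pty|Incorporation|Corporation|Limited)\.?)+$",
    re.IGNORECASE,
)

# Abbreviations the proposal says to expand (deliberately not generalized).
EXPANSIONS = [
    (re.compile(r"\bCo\b\.?"), "Company"),
    (re.compile(r"\bBros\b\.?"), "Brothers"),
]


def regularize(name: str) -> str:
    """Apply the three proposed rules: case, legal suffix, abbreviations."""
    name = " ".join(name.split())  # collapse stray whitespace
    if name.isupper():
        # ALL CAPITALS becomes title case, except for known all-caps words.
        name = " ".join(
            word if word in ALWAYS_CAPS else word.capitalize()
            for word in name.split(" ")
        )
    name = SUFFIX_RE.sub("", name).rstrip(" ,")
    for pattern, full in EXPANSIONS:
        name = pattern.sub(full, name)
    return name


if __name__ == "__main__":
    for stated in ["CORGI BOOKS", "Ace Books, Inc.", "Doubleday & Co.", "DAW BOOKS, INC."]:
        print(f"{stated!r} -> {regularize(stated)!r}")
```

Note that the stated form is thrown away by this function, so the "Full names and historical addresses" section mentioned above would still be needed to record it.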

Regularizing "inc"

Regularizing "inc" etc. makes me more nervous as we have all of these for the same company

Original name     | BLongley (Cautious mode) | BLongley (Aggressive mode) | Marc
Name, Inc         | Name, Inc.               | Name                       | Name, Inc.
Name, Inc.        | Name, Inc.               | Name                       | Name, Inc.
Name, Incorporated | Name, Inc.              | Name                       | Name, Inc.
Name, Corp        | Name Corp.               | Name                       | Name, Inc.
Name, Corp.       | Name Corp.               | Name                       | Name, Inc.
Name, Corporation | Name Corp.               | Name                       | Name, Inc.
Name Inc          | Name, Inc.               | Name                       | Name, Inc.
Name Inc.         | Name, Inc.               | Name                       | Name, Inc.
Name Incorporated | Name, Inc.               | Name                       | Name, Inc.
Name Corp         | Name Corp.               | Name                       | Name, Inc.
Name Corp.        | Name Corp.               | Name                       | Name, Inc.
Name Corporation  | Name Corp.               | Name                       | Name, Inc.

Have I missed any? For example, in the USA I believe a company may also use LLC / Limited Liability Company.

I need to consult an English manual of style to see if "Name, Inc." or "Name Inc." is correct though I tend to use the former. I do want to record that the name is "Name, suffix" as it is often an entirely different legal entity/construction than just "Name." "Ace Books, Inc." the publisher and "Ace Books" the Charter imprint is an example.

After beating "Inc." to death we'd need to do the same for Ltd/Limited, Pty/Proprietary, and constructs like Ltd Pty. Marc Kupper (talk) 18:52, 18 September 2008 (UTC)
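
As a rough illustration of the two BLongley columns in the table above, the following sketch maps each of the twelve stated forms to its "cautious" and "aggressive" result. The function name and the mode strings are hypothetical, and it covers only the Inc/Corp variants listed in the table, not Ltd, Pty or LLC.

```python
import re

# Matches "<name>[,] <suffix>[.]" for the suffix variants listed in the table.
VARIANT_RE = re.compile(
    r"^(?P<name>.+?),?\s+(?P<suffix>Inc|Incorporated|Corp|Corporation)\.?$"
)

# Cautious mode only settles punctuation and abbreviation, as in the table.
CAUTIOUS_FORM = {
    "Inc": "{name}, Inc.",
    "Incorporated": "{name}, Inc.",
    "Corp": "{name} Corp.",
    "Corporation": "{name} Corp.",
}


def regularize_suffix(original: str, mode: str = "cautious") -> str:
    """Return the regular form of a stated name under the chosen mode."""
    if mode not in ("cautious", "aggressive"):
        raise ValueError(f"unknown mode: {mode!r}")
    stated = original.strip()
    match = VARIANT_RE.match(stated)
    if match is None:
        return stated  # no recognized suffix, leave the name as stated
    name = match.group("name")
    if mode == "aggressive":
        return name  # drop the legal suffix entirely
    return CAUTIOUS_FORM[match.group("suffix")].format(name=name)


if __name__ == "__main__":
    for stated in ["Name, Inc", "Name Incorporated", "Name, Corporation"]:
        print(stated, "|", regularize_suffix(stated), "|",
              regularize_suffix(stated, "aggressive"))
```

In aggressive mode "Ace Books, Inc." and "Ace Books" collapse to the same string, which is the loss of the publisher versus imprint distinction raised above.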

This is either a very good example (I'm untainted by too much prior knowledge) or a very bad one for me (can't check my thoughts against real publications). I only have one ", Inc" book and that should be an ", Inc." - but I was alone in that and I'm going to change it to the simpler imprint anyway. (Don't worry, I'm holding off on big publisher updates while the experiment continues, but I'm certainly correcting my own verifications still.) All my other ", Inc."s are magazines and I'm not going there. I wouldn't hesitate in merging "Name, Inc" into "Name, Inc.", "Name, Corp" into "Name, Corp.", "Name Inc" into "Name Inc." and "Name Corp" into "Name Corp." though. And yes, we have LLCs in ISFDB. BLongley 19:52, 18 September 2008 (UTC)
Has anyone tried to determine whether any of these variations have any actual useful meaning? I am inclined to doubt it. Most manuals of style will say that by default both "inc" and "Corp" should have periods, but that a particular firm's house style or formal name trumps any general rules. Had I noticed this before these discussions began, I would have merged it on sight, and not worried that any useful info would be lost. Now I will do some research, and post results, and then consider (or else just leave Corgi alone for the moment).
On the merits of preserving and documenting minor variations in publisher names, I am not yet convinced. I would like some answer to the question "what benefit do you expect to reap through this activity?" Above Marc (or was it Bill?) says that having multiple forms in the wiki is harmless. I agree that it is much less of a problem than in the db, and if most of those are or swiftly become redirects, there is no harm at all. But if there are multiple pages for different forms of the name of a single firm with substantive text/data on them, they will soon become out-of-sync, and no one going to one such page will know if s/he has all the info available, or if other forms need to be looked at. That way lies confusion, duplication of effort, and reduced usefulness of the wiki pages. I will agree that some variations may well be significant, possibly marking a change in organization of a firm, or in the case of "ltd" vs "inc" (or the like) marking related (or just similarly named and perhaps once related) firms in different countries. But I cannot imagine, and have not read of here, any plausible real-world significance of the presence or absence of a period or comma. -DES Talk 21:19, 18 September 2008 (UTC)
Bill, if I understand what you wrote correctly, the only normalization or regularization of "inc" that you'd like to do for verified names is to add a period to the end if it's missing from an abbreviated form of the name? I changed the bulleted list of forms into a table so you can update the second column to be the regular form of the name. Marc Kupper (talk) 19:40, 19 September 2008 (UTC)
No, that's the MINIMUM I'd do if I was still regularizing database publishers, and I'd happily do it without hesitation or discussion. After that, I'd be checking for active verifiers of one variation or another, and would probably fold an entirely unverified name into a similar one with verified pubs. Or if there's vastly more of the unverified one then I'd ask the verifier if they minded a change to the more common version. If pushed further, I'd probably have just one Inc and one Corp (I have no knowledge of whether they mean the same thing or not): I generally prefer shorter names so am happy to have abbreviations, but I have a vague feeling (I'm not sure from where) that a Corporation is usually "The Name Corporation" and an Incorporated Company is normally "Name, Incorporated" so I'd have "Name Corp." but "Name, Inc." But ideally, if there's no useful difference (e.g. if they changed from one form to the other at a particular date), just "Name" is my preference. BLongley 19:59, 19 September 2008 (UTC)
I've been very aware that I'm not the expert on worldwide publisher name conventions so I've just been reducing the number of forms of each publisher name, in the hope that once we had a significantly reduced number of regularized names we could then consider some standardized names. In the short term it doesn't matter too much if one publisher ends up as Inc and another as Corp and another as Incorporated and another as Corporation so long as we have unnecessary variations minimized. But KEEPING the unnecessary variations to a minimum probably requires agreement on Standardizing names. Ideally, an editor would have a drop down list of the most common 'approved' publisher names to guide them, and if they STILL choose an uncommon one or a new one then moderators will know the rules about new ones, and have access to all the existing uncommon ones. A drop-down list of 8000+ publishers is unfeasible though, and having no way to add new ones would be unacceptable too. But splitting the work so that Editors are usually helped automatically and Mods have guidelines for the special cases should be possible. I even think we could have some automatic conversions for known Librarian's conventions and Amazon abbreviations. Still, that's a set of programming changes and such can't be accomplished until we have the data: which names we approve of and which we don't for a start, and maybe a few conversions. (E.g. if somebody enters exactly "London: Panther" I would like automatic conversion to just "Panther" rather than reject it or send it to a Mod for review.) BLongley 19:30, 19 September 2008 (UTC)
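The editor/moderator split described above could look roughly like the following. It is a hypothetical sketch, not an existing ISFDB feature: the names `strip_place_prefix` and `triage`, the three outcome labels, and the assumption that a librarian-style entry always has the shape "Place: Name" are all invented for illustration.

```python
import re
from typing import Set, Tuple

# Assumed shape of a librarian-style entry: "Place: Name".
_PLACE_PREFIX = re.compile(r"^\s*[^:]+:\s*(?P<name>\S.*?)\s*$")


def strip_place_prefix(entered: str) -> str:
    """Drop a leading "Place:" from an entered publisher name, if present."""
    match = _PLACE_PREFIX.match(entered)
    return match.group("name") if match else entered.strip()


def triage(entered: str, approved: Set[str]) -> Tuple[str, str]:
    """Decide what happens to a publisher name typed by an editor.

    Returns ("accept", name) for an approved name, ("convert", name)
    when a known convention maps it onto an approved name, and
    ("review", name) when a moderator should look at it.
    """
    cleaned = entered.strip()
    if cleaned in approved:
        return ("accept", cleaned)
    converted = strip_place_prefix(cleaned)
    if converted != cleaned and converted in approved:
        return ("convert", converted)
    return ("review", cleaned)
```

With `approved = {"Panther"}`, the entry "London: Panther" is converted to "Panther" automatically, while an unknown spelling falls through to moderator review rather than being rejected.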
DES - I'm pretty sure all the forms are the same and that we could fold the whole thing into "Name, Inc." (with the period) other than perhaps LLC which seems rare in the USA. Marc Kupper (talk) 00:14, 19 September 2008 (UTC)
I just added how I would regularize "inc". Marc Kupper (talk) 00:29, 19 September 2008 (UTC)
I would be much happier if we agreed that all variations of "inc" could be mapped to "Name, Inc." (or "Name, inc." or whatever standard form we prefer -- I think style manuals suggest that "inc" is lower case by default). I would be even happier if all such were simply mapped to "Name". I doubt that we will ever find both an "inc" and an "LLC" (or an "ltd") for the same firm, and if we do, I rather doubt that the difference will be significant. So I propose the following:
Original name | DES | Marc | Bill | Kevin
Name, Inc | Name | Name | Name | Name
Name, Inc. | Name | Name | Name | Name
Name, LLC | Name | Name | Name | Name
Name, Co. | Name | Ambiguous - we don't know if this was an error and should be "Name Company", is "Name, Company", or "Name, Corporation" meaning we don't know if it should be regularized to "Name, Company", "Name Company" or "Name". I would look elsewhere in the publication and if needed, other sources, to resolve this. We can add notes to the verification source publication article that explain the research/reasoning that went into how we regularized this name. | Name | Ambiguous - See Marc.
Name & Co. | Name & Company | Name & Company | Name ideally, or justify why "and company" is needed. | Name & Company (& Company is always required to separate it from 'Name, Co.' etc that may confuse a new editor)
Name Publishing Company | Name | Name Publishing Company | Name | Name Publishing Company
Name Books | Name | Name Books | Name, unless suffix is needed for disambiguation | Name Books
Name Press | Name | Name Press | Name, unless suffix is needed for disambiguation | Name Press
Name & Bros | Name & Bros. | "Name & Brothers" or "Name & Bros." - "Name & Bros." would be better if a publisher is consistently using that form. If they use both forms then use the longer version. | Name & Bros. | "Name & Brothers" or "Name & Bros." - See Marc
Name & Son | Name & Son | Name & Son | Name & Son | Name & Son('s)/Nephew('s)
Name, a division of OtherName | Name | "Name" and "OtherName" would be two separate verified names with their respective publishing name articles documenting the relationship. | Name. "Name" in the wiki would link to all OtherNames worth recording, even if ISFDB doesn't link to those entries directly. | Name, unless Othername was not a Publisher/Group (think large multinational, someone who owns a publishing 'division' and might list it as "Great Press, A division of Standard Oil" - Odd but it could happen)
Name, an imprint of OtherName | Name / OtherName | "Name" and "OtherName" would be two separate verified names with their respective publishing name articles documenting the relationship. | Name. "Name" in the wiki would link to all OtherNames worth recording, even if ISFDB doesn't link to those entries directly. | Name, (with wiki links to othername) except where a single imprint is shared by multiple publishers - See Viking in the DB.
Name, City/country | Name (with location data in the wiki) or Name (Location) if needed for disambiguation | "Name" and the article can document the city or country. | Name, City/country. The wiki would explain WHY this is needed for disambiguation. If the suffix isn't needed, just lose it. | Name (with location data in the wiki) or Name (Location) if needed for disambiguation
City/country, Name | Name (with location data in the wiki) or Name (Location) if needed for disambiguation | | Destroy on sight and replace with Name or Name, City/country. | Name (with location data in the wiki) or Name (Location) if needed for disambiguation
Name / SFBC Name / SFBC
I trust the pattern is clear. -DES Talk 16:19, 19 September 2008 (UTC)
Your pattern seems clear. I updated the table to add a column for "Marc's pattern" as I consider the "Publishing Company", "Press", and "Books" suffixes to be part of the name. For example the statement "Arbor House books are published by Arbor House Publishing Company" would indicate that "Arbor House" is the imprint name and "Arbor House Publishing Company" is the publisher's name. In informal conversation we would use "Arbor House" to refer to the publisher but that's not its full name. "Baen" is the same way where it's an imprint of Baen Publishing Enterprises. "Baen Books" is sticky as the publisher uses it in phrases like "A Baen Books original" meaning we don't know if it's an imprint or they are referring to themselves the company. I'm thinking of proposing a rename of Category:Logos anyway to something like "Logos and service names" to cover names like "Baen Books" that we can verify as being used by a publisher but not as an imprint.
I added one more row to deal with names that have city/country stuff. Marc Kupper (talk) 19:40, 19 September 2008 (UTC)
I added one for an abomination I want to destroy on sight - current searches and sorts mean we need most useful info first, disambiguations last. BLongley 22:40, 19 September 2008 (UTC)
I quite agree in hating location prefixes. -DES Talk 22:51, 19 September 2008 (UTC)
I could quite happily live with Marc's pattern above, even though it is not my first choice. I could see arguments both ways on "Books", "Press", and the like. Yes they are part of the legal name of the firm. But once you are regularizing at all, at least some canonical names will not be the firm's formal legal designations. Therefore, as long as the designations used are distinct, distinctive, understandable to editors and users, and capture the key term or terms of the formal names, failing to match the full formal name seems a non-issue to me. If "Arbor House" or even "Arbor" is what is most commonly used in informal conversation, then IMO that is the best name -- no one speaks of "Arbor House Publishing Company" except in formal, mostly legal, documents. Also, trying to assume that with companies like Baen there are in some meaningful sense both a Publisher and an Imprint, and that these are somehow separate and must have separate names, is IMO badly mistaken. There is one entity here, that functions as both a publisher and an imprint. If they started having separate logos for, say "Baen Fantasy" and "Baen SF" those might be imprints (or at least lines), but, to date, they don't do that. That said, if the consensus favors Marc's ideas as expressed above, I wouldn't feel too put upon. -DES Talk 20:43, 19 September 2008 (UTC)
I'm still leaning toward "imprint" or "sub-imprint" to be the name of choice when recording a publisher on an ISFDB pub record. I'd LIKE there to be another "As the editor wants to record it" field available, but in the meantime I can live with that being in pub notes. The useful (IMO) groupings of sub-imprints with imprints SHOULD be recorded in the wiki, but ideally will be available for searches in the database in the long-term - e.g. so I can search for "Tandem Science Fiction", my girlfriend can search for "Tandem Fantasy", and people less choosy can search for "Tandem" and find them all. If we haven't needed to separate sub-imprints, fine: I'll just search for "Hamlyn" rather than "Hamlyn Science Fiction" and know there really wasn't a "Hamlyn Fantasy" sub-imprint. BLongley 22:40, 19 September 2008 (UTC)
I mostly agree, except that I do think that using "Imprint / Publisher" in the db is often a good idea -- it allows a search on "Publisher" to find all its imprints. -DES Talk 22:51, 19 September 2008 (UTC)
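The kind of search wanted here (ask for "Tandem" and get every sub-imprint, or ask for one sub-imprint and get only that) can be sketched with a simple parent lookup. Nothing below reflects the real ISFDB schema: the `parent_of` mapping, the function names, and the placeholder titles are all hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def lineage(name: str, parent_of: Dict[str, str]) -> Iterator[str]:
    """Yield a publishing name followed by each of its recorded parents."""
    seen = set()
    current: Optional[str] = name
    # The "seen" set guards against an accidental loop in the mapping.
    while current is not None and current not in seen:
        seen.add(current)
        yield current
        current = parent_of.get(current)


def find_pubs(
    pubs: Iterable[Tuple[str, str]], parent_of: Dict[str, str], query: str
) -> List[str]:
    """Return titles whose recorded name, or any parent of it, equals query."""
    return [
        title for title, publisher in pubs if query in lineage(publisher, parent_of)
    ]


# Hypothetical grouping of sub-imprints under an imprint.
parent_of = {"Tandem Science Fiction": "Tandem", "Tandem Fantasy": "Tandem"}
# Placeholder publication records as (title, recorded publishing name).
pubs = [
    ("Book A", "Tandem Science Fiction"),
    ("Book B", "Tandem Fantasy"),
    ("Book C", "Hamlyn"),
]
```

Recording the sub-imprint on the publication and the grouping separately means neither the narrow search nor the broad one loses information.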
Above that level, we are in wiki-only territory for now (and maybe always will be) and there's a lot of disagreement about what's worth recording and/or separating. So whereas I'm happy with "Transworld" existing, and will use it, I'm not too fussy about every variation of the name and whether it was a division or a company - those can all be recorded on the same page. If Marc wants to record every single verified sighting of every variant spelling, abbreviation and typo, too, I have no problems with that - I'll always be using the links from ISFDB, or links from those pages to their parents that someone felt was needed, and then maybe down again. I won't be bothered with loads of "verified publisher/group/imprint" pages that aren't in that hierarchy, or can at least be left as "also see..." links to "hey, it was spelled 'Dobleday' in THIS pub!" stuff. I can understand that the more wiki-capable people will worry about fragmentation there, and I'd generally support some regularization/standardization there too.
I'm really liking the idea of using Publication Notes to record exactly why a certain pub is a verification source for a certain piece of data: I've used them very little before, but now I've run through a few publishers I find they're good for Artist Sig identification (if you care about Artists), Publisher addresses (if you care about those - I suspect we might spot some ownership details from such if two imprints turn out to be at the same address at the same time), Logos (might be useful for dating, and they're not always the cover logo visible on a Coverart upload, they might be the spine logo or title page logo). I'm still wary of "every variation" records for a publishing name that record every time a period was omitted or such. I'd happily leave those at publication record level, and USEFUL ones can get accumulated in the wiki. But I'm not generally using the Wiki for such, and even though Marc is I'll leave those problems to someone else. BLongley 22:40, 19 September 2008 (UTC)
Bill, I noticed a couple of times you used "Name, unless suffix is needed for disambiguation." I would agree with that if "name" is being used in the database but my thinking for the verified names project is to capture the full name that is being used by publishing companies. If there is a name collision, let's say with the name "Pocket" then the Pocket article itself would be tagged as a "verified name" in the appropriate categories and the page would say that the name was used by two separate companies with links to either #sections or separate disambiguated articles that both should mention that this is the article for X and that there's another Pocket over at Y. On the database side we are already disambiguating using (stuff in parentheses) plus that seems to be the preferred method over on Wikipedia meaning we can use the same here and it'll be understood by most people.
I believe once we start recording and verifying the full names the publishers are using that there will be few collisions. Part of this is that groups, publishers, and sometimes divisions are usually legal entities in themselves and the rules for registering a company name are that it can't be the same as another company name nor can you create a name that could be confused with someone else's name. Thus the companies will tend to have unique names. The collisions we are likely to see are companies with the same name in two different countries or in two different eras (time wise). Marc Kupper (talk) 23:00, 19 September 2008 (UTC)
I could see lots of point to the wiki page Publisher:Pocket recording the various ways in which the name "Pocket" was used, and by what entities: Say "Pocket Books", "Pocket Library", "Pocket Childrens library", "Pocket SF", "Pocket Publishing, inc" and the like (some of these are real if from memory, some invented to make the point). But I don't see the value of having separate wiki pages for all these if they are going to be combined in the db (and I would hope that most if not all would be), and I do see significant value in having all this on one page: the user will see all the info together, and there will be far less risk of the various pages getting out of sync, or of extra work needed to maintain them. Having redirect pages from known variations might well be a good idea, then if someone created one in the db the link will work, and take the user to the right place; and if someone tries to create one in the wiki, s/he will be notified of the proper page to record such info. Such a page can always be moved if we determine that a different name is the best "main" or "anchor" or "canonical" name. We can record which names or forms have been verified and how, perhaps with separate sections for each name. How does that sound to you? -DES Talk 23:20, 19 September 2008 (UTC)
The only reason for separate wiki articles for each verified name is to include them in the verified name categories. The pages can and probably should redirect to sections in a canonical article such as Pocket. We just need to stay aware of the keep-it-all-on-one-line restriction you brought up. Marc Kupper (talk) 00:31, 20 September 2008 (UTC)

(unindented) My vote is for Marc's pattern, which more or less mirrors what I've been doing anyways. So if no one has any objections I'll carry on with my cleanup of names, and what a mess some of them are! If this adds anything I'm for Imprint/Publisher (no spaces) where it makes sense. Kraang 01:59, 20 September 2008 (UTC)

Preferences added to the Table (with minor exceptions noted) Kevin 03:57, 20 September 2008 (UTC)
Please be aware that this discussion is only about regularization with respect to the verified names projects. Kraang and Kevin both made comments that seem to be more about regularizing the Publisher field in ISFDB publication records. As far as names with a city (before or after) go - if a publication states, for example, "Paperback Library, Inc New York", then
  • Paperback Library would be the verified name. A glance at the copyright page indicates this is a publisher.
  • On the article for Paperback Library we'd document details such as its full name "Paperback Library, Inc.", street address, etc. A verified name would never have a city, etc. appended to it for disambiguation but rather the Paperback Library article itself would explain the name is used by two different companies and that explanation can link to either separate disambiguated articles or to sections of the Paperback Library article.
Likewise, there is no need for publisher/imprint or imprint/publisher style names within the scope of verified names as the imprint and publisher are two separate names and their respective articles would explain the relationship. On the database side I'm considering not using the publisher/imprint or imprint/publisher format at all but rather will just use the imprint as I enter/verify publications. The only time I'd use / is for "name / SFBC." If there are specific publishers that need disambiguation with city, etc., then we should create a list of them. Marc Kupper (talk) 08:26, 21 September 2008 (UTC)
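Treating the imprint and the publisher as two separate names, with the relationship documented alongside, can be illustrated with a small parser for statements of the form "Name, a division of OtherName" or "Name, an imprint of OtherName". This is a sketch only: the `Statement` type and `split_statement` function are hypothetical, and real title-page wording will vary far more than these two patterns.

```python
import re
from typing import NamedTuple, Optional


class Statement(NamedTuple):
    name: str                 # the name as it would be verified on its own
    relation: Optional[str]   # "division" or "imprint", if stated
    parent: Optional[str]     # the other name, also verified separately


_RELATION = re.compile(
    r"^(?P<name>.+?),\s+an?\s+(?P<relation>division|imprint)\s+of\s+(?P<parent>.+)$",
    re.IGNORECASE,
)


def split_statement(text: str) -> Statement:
    """Split a publisher statement into two names plus their relationship."""
    cleaned = text.strip()
    match = _RELATION.match(cleaned)
    if match is None:
        # No stated relationship: the whole text is a single name.
        return Statement(cleaned, None, None)
    return Statement(
        match.group("name"), match.group("relation").lower(), match.group("parent")
    )
```

For Kevin's example, `split_statement("Great Press, A division of Standard Oil")` gives the name "Great Press", the relation "division", and the parent "Standard Oil", which is the information the two respective articles would document.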

Regularizing "LLC"

Regularizing "LLC" which is a different animal than a corporation. See wikipedia:Limited liability company.

Original name BLongley (Aggressive mode) DES acceptable Marc acceptable Kevin acceptable DES preferred Marc preferred Kevin preferred
Name, L.L.C. Name Name, LLC Name
Name, LLC Name Name, LLC Name
Name L.L.C. Name Name, LLC Name
Name LLC Name Name, LLC Name
Name, Limited Liability Company Name Name, LLC Name
Name Limited Liability Company Name Name, LLC Name
Name, Limited Liability Corporation (this is incorrect usage and it should be Company) Name Name, LLC Name
Rather than try and justify a general rule I wouldn't normally apply yet without consensus (as I have no publications of this sort) can I point you both at an example? I'd lean towards "Name" in all cases but do research my Publisher edits, so I left Wordcraft of Oregon, LLC separate from "Wordcraft of Oregon". One is purely SF, the other not: and there's a clear date change involved, which could be useful. (I'm not sure if the books are ever unclear enough on date to need such.) I think there's hundreds of regularizations we could safely do now, but then you need to look at what you have before the next step. BLongley 20:22, 19 September 2008 (UTC)
I agree that it is normally useful to look at what we have, and that sometimes such a change does coincide with a change in or difference of editorial team, and that is a change which matters. I also agree that publisher merges should be researched individually. The above is a suggestion, not a rule I intend to follow without discussion. However, I can't imagine that the difference between, say, "Name, LLC" and "Name, L.L.C." would ever be significant in such a way. -DES Talk 20:31, 19 September 2008 (UTC)
Agreed - it's just that the current examples make us look like we're proposing some radical updates, and I think we're actually proposing some MINOR ones first, then some case by case examinations, and then these are EXAMPLES of general rules/guidelines to AIM for, but with exceptions. BLongley 21:09, 19 September 2008 (UTC)

Canonical names/articles

I'm thinking of a new tag, {{cn}}, for canonical names with a light blue background, that can be used to flag names that lead towards canonical publishing articles. If a name is both verified and canonical then canonical overrides. The goal would be to highlight links that go to the longer/better articles rather than the stubby verified name things. The canonical names would be normalized/regularized including dropping "Inc." "Ltd" "Pty" and sometimes shortening a name up to the well known version, "Ace" for example.

While I normally don't like color coding lots of stuff (we'd now have colors for links to ISFDB, verified names, canonical names, plus plain links) it seems we should have a way to highlight which links are to the canonical articles. Maybe cn can just bold the link.

The goal is the eventual weeding out of the noise, as links/articles in the publishing area would be either canonical or verified, plus a few deviants such as DAW Trade, which is an "imprint" that DAW uses when uploading to Amazon and on their internal spreadsheets but not on their public web site or publications. Marc Kupper (talk) 22:20, 17 September 2008 (UTC)
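The "canonical overrides verified" rule proposed in this section amounts to a simple precedence check. The sketch below is hypothetical: `link_tag` is an invented helper, the two sets stand in for however the wiki would actually track these names, and the membership of "Ace Books, Inc." in the verified set is assumed purely for the example.

```python
from typing import Optional, Set


def link_tag(name: str, canonical: Set[str], verified: Set[str]) -> Optional[str]:
    """Pick the template used to link a publishing name.

    Returns "cn" for canonical names, "vn" for names that are only
    verified, and None for a plain link.
    """
    if name in canonical:
        # Canonical wins even when the name is also verified.
        return "cn"
    if name in verified:
        return "vn"
    return None


# Assumed example data, not a statement about the real categories.
canonical = {"Ace"}
verified = {"Ace", "Ace Books, Inc."}
```

Here "Ace" is in both sets and gets the canonical tag, "Ace Books, Inc." gets the verified tag, and any other name is left as a plain link.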

Consensus Summary?

Could someone who has been participating in this discussion, please summarize the conclusions for those of us who have been passing by, but without the time to keep up with the discussion? (Cut n paste or just, 'we agreed to use table x in section y' statements) - Thanks Kevin 03:11, 20 September 2008 (UTC)

I think I have parsed the regularization tables and understand the proposals there.... I'm still not sure I have grasped what the consensus was on the verified publishers part of the discussion. Kevin 03:17, 20 September 2008 (UTC)
I'd need to re-parse myself to see if there is a consensus or what the significant areas of disagreement seem to be. Marc Kupper (talk) 08:19, 21 September 2008 (UTC)

Warner - example stub "canonical" article

I just added the following for feedback:

The intent of the example is to show how I envision verified names working with canonical articles where we'd have one article for "Warner" that would be about the publishing group, publisher, and logo. Marc Kupper (talk) 08:14, 21 September 2008 (UTC)

I can't see how you look at "A Warner Communications Company" and extract "Warner Communications Company". I'd read the "A" as suggesting that there are multiple "Warner Communications" companies, of which "Warner Books" is one. BLongley 10:12, 21 September 2008 (UTC)
Also, you've extracted "Warner Press", Inc. of Anderson, Indiana - surely this is a secondary source not a primary source for such? I'd prefer to see a verified Warner Press established from a Warner Press book, not a book that specifically is NOT one. An unverified publisher stub would be more appropriate for now I think. BLongley 10:12, 21 September 2008 (UTC)
Re: Warner Communications Company - Good catch. The source publication twice uses "A Warner Communications Company" with an upper case C on Company and the upper case C tripped me up. Yes, the publishing group seems to be "Warner Communications Inc."[3]. I've renamed Warner Communications Company to Warner Communications.
Re: Warner Press - Agreed, it's a secondary source though one that's verifiable in print and confirmed via the company web site. We are already using many secondary references. For example, most publications are a secondary reference for author and artist names and yet we call them "primary references" in ISFDB for establishing author names as the names are verifiable in print.
Well, we're interested in the author and artist names as printed - so the books and magazines we enter ARE the primary references for those uses. The "real" names, or "legal names" are the ones we don't have primary references we can point at: for instance, I can prove that "Lewis Carroll" is an ISFDB author by picking up any of the books, but I can't immediately point at any record here that, if someone found the physical representation, would say that "Charles Lutwidge Dodgson" exists. BLongley 23:42, 21 September 2008 (UTC)
I thought we covered this already. The verified name project is not attempting to figure out or document the "legal" name of a company. To document that we need to get articles of incorporation, dba statements, etc. Verified names is an attempt to document the names and other things about publishing companies as stated in publications and to a lesser extent other sources. Marc Kupper (talk) 06:23, 22 September 2008 (UTC)
I understand that, I'm just pointing out why I believe Primary Verification via a book with that name in IS Primary for author names, not secondary. BLongley 18:11, 22 September 2008 (UTC)
You do have a good point though. Should the verified names project only flag names as "verified" if, and only if, it comes from a primary source for that name? It probably also means that publishing group names listed on a copyright page would be "secondary."
I really don't think we should create a verified name from a source that states it's totally unrelated. One that states it IS related is a bit safer - particularly if it comes from the copyright page where legal formalities have to be observed. BLongley 23:42, 21 September 2008 (UTC)
Legal formalities? LOL! There are none, at least in the USA. The book I happened to pick for this example had to have a piece of strangeness with the statement "That other company is not us!" I agree with you though that normally it would not be a verified name. I only flagged it as verified after finding the company still existed and they clearly referenced the same name/address. Had that not been the case I would not have marked that name as verified. Marc Kupper (talk) 06:15, 22 September 2008 (UTC)
Surely there's enough legal requirements to avoid passing off as a different company, for instance? I couldn't publish a book with Baen's details on the copyright page? I can understand if there's no particular ramifications for getting a Library of Congress number wrong for instance, or even for falsely claiming a first edition, but whereas I could write a story that says Baen were taken over in 2008 by Longley publishing, I doubt I could get away with that on the copyright page? BLongley 18:11, 22 September 2008 (UTC)
My thinking though was let's say a book has a secondary credit for Faber and Faber, LTD as being the original publisher of a work. Later someone working on Faber in ISFDB finds that the primary sources are using "Faber & Faber." In that case the person can remove the "verified name" designation for "Faber and Faber" and will add a note to the Faber article that while the company has been credited at times as "Faber and Faber" that the company itself consistently uses "Faber & Faber". That note will be able to cite both a primary source and to give an example of a secondary source that used "Faber and Faber."
I'd prefer that secondary-sourced names don't get started as verified names and demoted later - people might start doing that to verified pubs they don't agree with. Just create an ordinary publisher stub entry and promote to verified name when we have a verification source from that name. BLongley 23:42, 21 September 2008 (UTC)
That seems reasonable. Should we have a "secondary sourced" header to make it clear that an article is based on secondary source(s) and to give instructions on how to convert it to a primary sourced article? Marc Kupper (talk) 06:28, 22 September 2008 (UTC)
Try it. I've just added this, feel free to experiment with that in any way you want, it's not verifying anything yet. 19:18, 22 September 2008 (UTC)
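The workflow suggested in this exchange (start a secondary-sourced name as an ordinary stub, promote it once a primary source turns up, never demote) can be written down as a small status rule. Everything here is an assumption for illustration: the `Sighting` record, the status labels, and the invented tag "EXAMPLETAG1".

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Sighting:
    pub_tag: str   # tag of the source publication (hypothetical field)
    primary: bool  # True if the publication is from the named company itself


def name_status(sightings: Sequence[Sighting]) -> str:
    """Classify a publishing name from the sightings recorded for it."""
    if any(s.primary for s in sightings):
        return "verified"
    if sightings:
        # Only secondary mentions so far: keep it as a flagged stub.
        return "secondary-sourced stub"
    return "unverified stub"
```

Because the status is derived from the recorded sightings, adding a primary sighting promotes the name without anyone having to edit or remove the earlier secondary references.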
In the mean time I believe it's less overhead to just verify that a name appeared and ideally whoever set up that source publication reference will also include enough context so that others can determine if the reference would be considered primary or secondary for that name. Marc Kupper (talk) 21:58, 21 September 2008 (UTC)
I would think that a "secondary" source such as "About Warner Press" or a published article or reference work would be a better source than any particular publication. But this kind of "This is us, not them" notice is not the best evidence for "them", but I see no reason why it won't do until better comes along. -DES Talk 22:51, 21 September 2008 (UTC)
I'm finding that company web sites are only documenting the way things are now and that Wikipedia tends to be based on company web sites. It's a rare company that has a historical section and even those tend to not be accurate when it comes to the names as they will refer to things by their internal shorthand name rather than the name(s) used on publications. Some of the histories available are based on human memory rather than reviewing and documenting historical documents. While these are often fascinating reads they usually fall short on accuracy. Marc Kupper (talk) 06:52, 22 September 2008 (UTC)
From looking at lots of copyright pages a bit more closely than usual since we started this, it seems that I'd get more of what I want out of this by looking at registered trademarks. Those seem freely accessible in the UK, Company information doesn't seem always so easily accessible for free. Internal company designations like "division" and overseas subsidiaries even less so. I suspect that for before the 1950s we'd have to rely on publication data exclusively, or references such as you mention, and trademarks that are now totally out of use sometimes don't seem to have had the official data copied to the web. But I could put up a lot of verification for imprint names I've been using automatically before the call for verification sources came along, and the current activity doesn't seem to be driving out any IMPRINT names, just companies, groups, divisions, etc. BLongley 23:42, 21 September 2008 (UTC)
I've also been taking some shortcuts to support things that I suspect other people want - e.g. when adding a dated address for a publisher I've started linking to the pub I found the address in. I HAVEN'T created a full impression of the copyright page of that pub to link to though, just said which pub you can find such an address in. Does this annoy people? Do people even care about addresses that much? Am I recording addresses at the right level (imprint, publisher, group...)? BLongley 23:42, 21 September 2008 (UTC)
Only documenting the names is fine with me and for some source-references I only captured sentences/statements that contain the names we are interested in. If I cut it down to just the names though then Bill would not have been able to catch that I captured Warner Communications wrong unless he has a Warner book.
I'll probably continue to capture the bottom of the title page as accurately as possible, as it usually only has data that's directly relevant to the project. One annoyance is that my scanner has a frame or bezel around the glass, meaning it takes some finessing to scan the inside of a book to get the logos. I could switch to digital camera images but then I lose the scaling. With the scanner I can document the DPI setting so that others can figure out the true size of the scanned image. Marc Kupper (talk) 06:42, 22 September 2008 (UTC)
Bill, I looked over the existing source-pubs you have entered and they seem to be fine. One suggestion is to forward reference from the source-pub articles to the names using {{vn}} and {{pn}}. This would help catch things like THBSNSTR1C1976 which states "Cox and Wyman" while the publishing name article is Cox & Wyman. I suspect the "&" version is correct meaning that either the publication used the name in error or the company uses both forms of the name. Cox & Wyman seems to be part of the CPI book manufacturing group now[4]. It's interesting how CPI does not say they started and are headquartered in France. Marc Kupper (talk) 07:27, 22 September 2008 (UTC)
That turned out to be an error on my part. I guess I was more excited about the publisher change of address. BLongley 18:20, 22 September 2008 (UTC)
Hmmmm.... I can see some problems with linking forwards. E.g. here the words "Ballantine" and "Books" are on separate lines. It looks as though showing the exact format and linking are going to be incompatible at times. And here there's no "regularised case" name to link from, although that of course could be added as a separate line. BLongley 18:45, 22 September 2008 (UTC)
During the regularization discussion I was thinking of adding a section to the top of source-pub articles that said "The following names appeared on the copyright page that were regularized for use as a verified name (link this to the page that'll document how to regularize names): POPULAR LIBRARY Popular Library, etc." That'll also allow you to deal with wrapped names. Marc Kupper (talk) 22:01, 22 September 2008 (UTC)
A second solution comes to mind if Al was available, which would be to add "if" support to templates. This is a MediaWiki add-on and we could then add an optional second parameter to vn so that someone could use {{vn|Popular Library|POPULAR LIBRARY}} which would display as POPULAR LIBRARY while linking to the Popular Library article. This could also handle names split over two lines as you can have {{vn|Popular Library|POPULAR}} on the first line and {{vn|Popular Library|LIBRARY}} on the second with both of them linking to the Popular Library article. Marc Kupper (talk) 23:05, 22 September 2008 (UTC)
You don't need IF support for that. Look at the way Template:P deals with its "name" parameter. -DES Talk 22:21, 23 September 2008 (UTC)
Another thing I can't do, it seems: {{vn|{{pn|Arrow}} Books}} to show the imprint I like, AND the verified name you've established. Can these templates be made to nest? BLongley 18:59, 22 September 2008 (UTC)
I would argue that "Arrow Books" appearing on a page is one name and not one name nested inside another. If a publisher never references "Arrow" as in "First Arrow edition: May 1973" then you would not have a way to link to the Arrow article other than by adding a note at the top of the page saying that the publisher never uses "Arrow" but here's a cool article... Note that if Arrow is a canonical article then the Arrow Books article will be redirecting to Arrow. Marc Kupper (talk) 22:37, 22 September 2008 (UTC)
Well, I'd say that your extracting "Arrow Books" as a verified name from "Arrow Books Ltd" appearing on a page is already an example of one name nested inside another, it's just that you've already dismissed the company suffix as something that can be discarded. I find that "Books" can mostly be disregarded too when finding the useful (ISFDB database field) imprint. Nested templates might help, particularly if somebody wanted to reintroduce a link to the company based on the same words. Although I presume if two nested links are too hard, three would be worse, and what Thomas conneely enters would be nigh impossible. ;-) BLongley 20:09, 23 September 2008 (UTC)
From a technical perspective what you want to do is rather hard and it's also not clear just what such a nested construction is supposed to do. Would you want to see "Arrow Books" where the first part links to Arrow and the second to Arrow Books??? Marc Kupper (talk) 22:37, 22 September 2008 (UTC)
That would do. It makes it rather clearer that people are extrapolating different conclusions from the same data. It looks a moot point though, if you're suggesting dropping vn.BLongley 20:09, 23 September 2008 (UTC)
This does bring up that if we don't "verify" names from secondary sources then we'd be linking to the articles with {{pn}}. If later someone verifies the name via a primary source they would need to edit the pages that reference the article and change all the pn's to vn's. If Al was available then we could add robot support and a robot that would sweep and change references to verified articles to use vn and change vn references to non-verified articles to use pn. Note that at present neither vn nor pn "do" anything. Both just make it easy to reference articles in the Publisher namespace, with vn also highlighting the link and adding the hover-text. Marc Kupper (talk) 22:37, 22 September 2008 (UTC)
If we want a semi-automated way to convert uses of {{pn}} or direct wikilinks to uses of {{vn}} when the destination page is in a verified category, this can be set up on a semi-automated basis using AutoWikiBrowser. Whether we want that or not is another question, but the conversion can be automated enough that it shouldn't be an argument one way or another.
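
As a rough illustration of the sweep discussed above, here is a minimal Python sketch. It is not AutoWikiBrowser and not an existing ISFDB robot; the function names and the VERIFIED set are hypothetical, and a real run would take the list of verified names from the verified-name categories rather than from a hard-coded set.

```python
import re

# Hypothetical stand-in for "pages in a verified category".
VERIFIED = {"Arrow Books", "Popular Library"}

# Matches {{pn|Name}} or a direct link of the form [[Publisher:Name|Label]].
PN_OR_LINK = re.compile(
    r"\{\{pn\|([^{}|]+)\}\}"
    r"|\[\[Publisher:([^\[\]|]+)\|([^\[\]|]+)\]\]"
)
VN = re.compile(r"\{\{vn\|([^{}|]+)\}\}")


def promote_links(wikitext, verified=VERIFIED):
    """Change pn uses and plain Publisher links to vn when the target is verified."""
    def repl(match):
        if match.group(1) is not None:
            name = match.group(1).strip()
            return "{{vn|%s}}" % name if name in verified else match.group(0)
        target, label = match.group(2).strip(), match.group(3).strip()
        # Only convert direct links whose label equals the target, because
        # the one-argument vn cannot carry a different display text.
        if target in verified and target == label:
            return "{{vn|%s}}" % target
        return match.group(0)
    return PN_OR_LINK.sub(repl, wikitext)


def demote_links(wikitext, verified=VERIFIED):
    """Change vn uses back to pn when the target is not (or no longer) verified."""
    def repl(match):
        name = match.group(1).strip()
        return match.group(0) if name in verified else "{{pn|%s}}" % name
    return VN.sub(repl, wikitext)
```

Running promote_links and then demote_links over a page would leave every Publisher link in the form that matches the current verified status of its target, which is the bookkeeping the robot idea is meant to remove.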

Drop/keep vn?

I'd like to propose dropping {{vn}} and instead we can just use {{pn}} when desired to link to names in the Publisher namespace.

  • Color coding links is a non-standard practice plus design/style guides advise limiting the number of fonts/sizes/colors used on a page.
  • If we adopt a practice of using vn to link to names identified as "verified" from a primary source reference and pn for names either not verified or verified from a secondary source then we have a bookkeeping hassle.

Both vn and pn are shortcuts with {{pn|Some Name}} expanding to [[Publisher:Some Name|Some Name]] and vn doing the same expansion but also adding the color coding and the "Verified Publishing Name" hover-text.
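
For readers who want to see the expansion spelled out, here is a small Python sketch of what the two shortcuts produce. The pn expansion follows the description above. The markup used for vn (a span with a color and a title attribute) is only an assumption made for illustration; the discussion does not say how the real template implements its color coding and hover-text.

```python
def expand_pn(name):
    """Expand {{pn|name}} as described: a piped link into the Publisher namespace."""
    return "[[Publisher:{0}|{0}]]".format(name)


def expand_vn(name, color="green"):
    """Expand {{vn|name}}: the same link plus color coding and hover-text.

    The span wrapper and the default color are assumptions, not the
    template's actual source.
    """
    return '<span style="color:{0}" title="Verified Publishing Name">{1}</span>'.format(
        color, expand_pn(name)
    )
```

Since the link target is identical in both cases, dropping vn would only remove the visual distinction, not any links.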

I could see resurrecting vn in the future if we get a robot set up that would automatically take care of using vn to link to pages in one of the verified name categories and using either pn or direct [[]] wiki links to other pages.

I'll keep voting open a week. Marc Kupper (talk) 19:38, 23 September 2008 (UTC)

Vote to drop vn:

Vote to keep vn

  • I'll vote to keep it for now. I don't want to see it disappear after a week or two of limited trials. There is some good to be had from this experiment and I don't want to see the positive points lost - I'm not sure if "vn" IS one but it's too early to say for sure if it could be useful in some cases. BLongley 22:24, 23 September 2008 (UTC)
  • Vote changed to keep. Marc Kupper (talk) 21:46, 24 September 2008 (UTC)

Neutral / comments on keeping or dropping vn:

  • I suspect that at some point I will get really annoyed at pointless recording again, multiple conclusions drawn from the same data, and all sorts of wiki-rules being imposed or at least "strongly suggested", and will demand more attention to what goes in the DATABASE. But I've been working on publisher pages in the wiki for months now, not weeks: I haven't withdrawn any of my edits/experiments yet and want them discussed rather than removed as a failed experiment. If Voting is the way to go, then maybe the templates for publisher could include a suggestion for a simple Yes/No vote on whether a particular page is useful or not, and if not, what should we do with the data? Obvious candidates are the pages we automatically get links to (from the database publisher field itself) - that could immediately suggest a problem with the database entries. Groupings/linkings of such pages are little used. I don't even know what people actually want on such - I suspect it's somewhere between what I've done and what Marc's project is creating - but more examples, and more participation would be a good start, IMO. BLongley 22:24, 23 September 2008 (UTC)
  • If we want a semi-automated way to convert uses of {{pn}} or direct wikilinks to uses of {{vn}} when the destination page is in a verified category, this can be set up on a semi-automated basis using AutoWikiBrowser. Whether we want that or not is another question, but the conversion can be automated enough that it shouldn't be an argument one way or another. -DES Talk 23:20, 23 September 2008 (UTC)
  • I should add that if we want to make the distinction, the vn result could include the words "Verified name" or whatever other wording we want as part of the link text, instead of just a color code. -DES Talk 23:22, 23 September 2008 (UTC)
vn already includes the words "Verified Name." Hover the mouse over Arrow Books with either FireFox or IE. Marc Kupper (talk) 00:40, 24 September 2008 (UTC)
So it does. It could, if we wished, include those words as part of the displayed text, not just as hover text. Or not, but that would help make the color redundant rather than essential. -DES Talk 17:21, 24 September 2008 (UTC)
I got the impression from people that some of the, at least initial, objection to the verified names project was its intrusiveness. Maybe it was just that page header that said "don't edit this page under pain of death (or citing references)" that bothered people. Anyway, I tried to make vn as unobtrusive as possible but agree that it could easily add displayed text rather than using color coding and hover text. If we had whatever software makes <ref> then vn could be near invisible except for adding an entry to a section that would work like <references/>. Marc Kupper (talk) 21:40, 24 September 2008 (UTC)
  • This morning I woke up and realized I could add another template. :-) Initially it was to be vn2 but I'm now leaning towards {{vnn}} for Verified Name, Normalized. It'll be just like vn except it takes two arguments and will allow for {{vnn|Pocket Books|POCKET BOOKS}} and would be used to convert an abnormal/irregular name into its normalized or regularized form. Marc Kupper (talk) 21:46, 24 September 2008 (UTC)
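
To make the two-argument idea concrete, here is a Python sketch that rewrites vnn calls in a piece of wikitext into piped links. It only models the link target and display text; the color coding and hover-text are left out, and the function name is hypothetical.

```python
import re

# {{vnn|Normalized Name|STATED NAME}}
VNN = re.compile(r"\{\{vnn\|([^{}|]+)\|([^{}|]+)\}\}")


def expand_vnn(wikitext):
    """Replace each vnn call with a link that shows the stated form of the name
    but points at the normalized Publisher article."""
    return VNN.sub(
        lambda m: "[[Publisher:%s|%s]]" % (m.group(1).strip(), m.group(2).strip()),
        wikitext,
    )
```

The same call form would also cover a name that wraps over two lines on the copyright page, since each line can carry its own vnn call pointing at the same article.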

Publishing Names - coding

The other morning I woke up thinking about much of what I was doing with respect to verified names was mechanical. The source publication records are being used to define and document roles, relationships, and addresses using one of:

  1. Name is a Role with role being one of publisher, printer, publishing group, etc.
  2. Name1 is a/an relationship of Name2 with relationship being one of imprint, division, affiliate, subsidiary, etc.
  3. Name is located at Address.
  4. printing # by Name printing on date. This will generally be used on later editions and printings to document earlier editions or printings.
  5. Name1 is now Name2 on Date. This would be used to note that an organization's or imprint's name changed. Date is optional.
  • Any of the first three items can be flagged as "secondary" if they are not referring to the same company as the publisher of the publication. The thinking here is that a publisher writing about another company, such as who manufactured or printed the book, may not have accurate information about the other company's name or address. Other than for a first printing, the 4th item is always secondary and is also a special case that allows for Name to be an imprint, publisher, or common name.
  • Any of the first three items listed above can also include "on date" as in "Name was a Role on date." These citations would be considered secondary as they are defining something that occurred prior to printing the publication and we don't know how accurate that citation is as far as the name or date goes. The most frequent use will be to document previous publication of a title by either the same or another publisher.
  • Any Name can get flagged to indicate that it's not the name stated/used in the source publication. This will come up most often when disambiguating a name like "Transworld Publishers" into one of "Transworld Publishers (Australia)", "Transworld Publishers (South Africa)", etc. Ideally, however this flagging gets done, we should know the stated name and then the ISFDB assigned name.
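
One possible way to write the five statement types and their flags down as data is sketched below in Python. All class and field names are invented for illustration; nothing here reflects an existing ISFDB schema, and dates are kept as free text because publications state them in many forms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Name:
    stated: str                     # as printed in the source publication
    assigned: Optional[str] = None  # ISFDB-assigned form, e.g. a disambiguated name

    @property
    def is_assigned(self):
        """True when the name we use differs from the stated one."""
        return self.assigned is not None and self.assigned != self.stated


@dataclass
class Role:                         # item 1: Name is a Role
    name: Name
    role: str
    on_date: Optional[str] = None
    flagged_secondary: bool = False  # about a company other than the publisher

    @property
    def secondary(self):
        # "on date" citations describe the past, so they count as secondary.
        return self.flagged_secondary or self.on_date is not None


@dataclass
class Relationship:                 # item 2: Name1 is a/an relationship of Name2
    name1: Name
    relationship: str
    name2: Name
    on_date: Optional[str] = None
    flagged_secondary: bool = False

    @property
    def secondary(self):
        return self.flagged_secondary or self.on_date is not None


@dataclass
class Location:                     # item 3: Name is located at Address
    name: Name
    address: str
    on_date: Optional[str] = None
    flagged_secondary: bool = False

    @property
    def secondary(self):
        return self.flagged_secondary or self.on_date is not None


@dataclass
class Printing:                     # item 4: printing # by Name on date
    number: int
    name: Name
    date: Optional[str] = None
    flagged_secondary: bool = False

    @property
    def secondary(self):
        # Anything other than a first printing is always secondary.
        return self.number != 1 or self.flagged_secondary


@dataclass
class Rename:                       # item 5: Name1 is now Name2 (date optional)
    old: Name
    new: Name
    date: Optional[str] = None
```

Keeping both the stated and the assigned form on Name means a disambiguated name never loses the wording that actually appeared in the publication.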

Once we have the roles, relationships, and addresses defined, what I'm then generating in the wiki-articles seems like a mechanical exercise. To see if this is correct I'll write an application that takes the source data and see if it can generate the wikitext for the articles. If that's sustainable then phase 2 would be to either add a data-block to the source publication articles or to implement the data blocks directly in ISFDB as a "Publishing Names" part of publication records.
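
A minimal Python sketch of what such a generator could look like follows. The input format (plain tuples), the bullet layout of the output, and the publication tag and names in the usage note are all made up for illustration; the real articles may well be laid out differently.

```python
def pn(name):
    """Link to a Publisher-namespace article using the pn shortcut."""
    return "{{pn|%s}}" % name


def render_statements(source_tag, statements):
    """Turn recorded statements from one source publication into wikitext bullets.

    Each statement is a tuple whose first element names its kind:
      ("role", name, role)
      ("relationship", name1, relationship, name2)
      ("address", name, address)
    """
    lines = ["Statements recorded from [[Publication:%s]]:" % source_tag]
    for st in statements:
        kind = st[0]
        if kind == "role":
            _, name, role = st
            lines.append("* %s is a %s." % (pn(name), role))
        elif kind == "relationship":
            _, name1, relationship, name2 = st
            lines.append("* %s is a/an %s of %s." % (pn(name1), relationship, pn(name2)))
        elif kind == "address":
            _, name, address = st
            lines.append("* %s is located at %s." % (pn(name), address))
        else:
            raise ValueError("unknown statement kind: %r" % (kind,))
    return "\n".join(lines)
```

For example, render_statements("EXAMPLETAG2008", [("role", "Example Books", "publisher"), ("relationship", "Example Books", "imprint", "Example House")]) yields a header line followed by one bullet per statement. If a role was mis-read, only the source data needs correcting and the article text can be regenerated.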

Does this seem reasonable or am I missing something? Marc Kupper (talk) 17:31, 28 September 2008 (UTC)

Well, Name can sometimes have SEVERAL roles, or switch roles over time. E.g. Panther Books Ltd being an independent Publisher, then a wholly-owned subsidiary of Granada, then just an imprint - shared around the Triad consortium at times - before having the name sold on to Harvill. And I think Collins were their own printers at times. And we might want to track an entity despite a name change: e.g. Methuen Children's Books was sold by Methuen to Egmont in 1998. I don't think it retained the name. BLongley 18:43, 28 September 2008 (UTC)
I'm thinking Name and role as it appears to be defined in a specific publication meaning at a specific instant in time. I should italicize appears as I know I've mis-read the role a number of times in the short time we've been trying this experiment but then that's another advantage of automating generation of the wikitext as we only need to correct the source articles.
It stands to reason that long ago (100 or more years) people were printers of their own stuff (vanity publishing) who later moved into what we know of as publishing where someone reviews, selects, arranges to print (or prints), and markets a publication with imprints and publishing groups being a natural evolution.
"And we might want to track an entity despite a name change:" Thank you, I added #5 to the list above. Marc Kupper (talk) 20:56, 28 September 2008 (UTC)
"Address" can sometimes be a bit vague too. Some books I've looked at don't make it clear whether that's the location for the imprint, the parent company/division, or publishing group. Or they list LOTS of addresses - which were probably all valid, but might not have anything to do with the publication itself: e.g. if a book mentions Random House UK, Random House Australia, Random House New Zealand, but doesn't have prices for all three, I suspect that there's just some boiler-plate being used and if it WAS published in the other countries it might have been in different formats/with different ISBNs/etc. BLongley 18:43, 28 September 2008 (UTC)
So far I have not seen a publication where it was impossible to tell if the address was for a division or the publisher but agree that this could easily exist. I suspect that if an address is vague then for documentation purposes we can cover both bases and have an address entry for the imprint and a second for the publisher, for example, both pointing at the same address. Names like "Random House Australia" seem to be just names with no address though presumably they are located in Australia or have something to do with Australia. Marc Kupper (talk) 21:06, 28 September 2008 (UTC)
Well, Publication:PCLPSSTJQH1998 and Publication:SNFFFCTNLV1999 have addresses, and might indicate Transworld premises being acquired. Or might be existing Random House addresses. BLongley 21:17, 28 September 2008 (UTC)
The addresses seem unambiguous and it's a matter of if you want to list all the addresses under Transworld Publishers or if we should disambiguate them to Transworld Publishers (country or region) while also making it clear that a name like Penguin Group (USA) is the actual name used by the publisher while Transworld Publishers (Australia) is one we made up to disambiguate Transworld Publishers. The address for Transworld Publishers (Australia) for example is:
Transworld Publishers
c/o Random House Australia Pty Ltd
20 Alfred Street, Milsons Point, NSW 2061
It does bring up whether the Name fields should also have a flag indicating if this is a stated name (nearly always the case) or one that we manufactured to help disambiguate. Without the flag it will be human nature to clean up the mess by appending something to the name or otherwise changing it from what's stated. One solution is sort of like how {{vnn}} works where we give it both the stated and normalized names, meaning that the use of vnn is a flag that indicates that there was a need to translate a name from what was stated. I suspect coding this should not be too bad as we can put the flag in the target record and anyone using that name in a source-publication would see the flag is enabled by default and can turn it off if the publisher should start using "Transworld Publishers (Australia)" for example. Marc Kupper (talk) 22:12, 28 September 2008 (UTC)
Still, I've thought for a long time that ISFDB itself could have a publisher hierarchy - relationships should have a start date and an end date so you can move up the hierarchy in the appropriate way for the time though. E.g. Orbit goes up to Futura in the earliest days, to Little, Brown & Co later, Hachette Livre currently. BLongley 18:43, 28 September 2008 (UTC)
Yep, that's exactly why the verified publisher name project exists. The first effort here in the wiki is more about prototyping how the data should be collected and what needs to be collected so that we can then show the hierarchy and relationships over time. Marc Kupper (talk) 21:14, 28 September 2008 (UTC)
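
The dated hierarchy suggested above can be sketched in Python as follows. This is only an illustration of the idea of start and end dates on relationships: the class, the functions, and especially the names and years in the usage note are placeholders, not researched data about any real publisher.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Link:
    child: str
    parent: str
    start: Optional[int] = None  # first year the relationship held; None = unknown
    end: Optional[int] = None    # last year it held (inclusive); None = still current


def parent_in(links: List[Link], child: str, year: int) -> Optional[str]:
    """Return the parent of `child` in the given year, if one is recorded."""
    for link in links:
        if (link.child == child
                and (link.start is None or link.start <= year)
                and (link.end is None or year <= link.end)):
            return link.parent
    return None


def chain(links: List[Link], name: str, year: int) -> List[str]:
    """Walk up the hierarchy from `name` as it stood in `year`."""
    result = [name]
    seen = {name}
    while True:
        parent = parent_in(links, result[-1], year)
        if parent is None or parent in seen:  # stop at the top, or on a cycle
            return result
        result.append(parent)
        seen.add(parent)
```

With placeholder links such as Link("Imprint X", "Parent A", 1970, 1989), Link("Imprint X", "Parent B", 1990) and Link("Parent B", "Group C", 2000), chain(links, "Imprint X", 1980) walks up to Parent A, while the same call for 2005 goes through Parent B to Group C.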