Talk:ISFDB Downloads

From ISFDB

Revision as of 07:36, 8 May 2011 by Ahasuerus (Talk | contribs)
Jump to: navigation, search

Thanks for making this page actually useful. Those tube upgrades are awesome! Alvonruff 18:48, 12 Sep 2006 (CDT)

Nothing like new vacuum tubes to keep them bibliographic juices flowing! :) Ahasuerus 13:22, 14 Sep 2006 (CDT)

Contents

Python and CGI

On Windows 2003 I have MySql, Apache, PHP, and MediaWiki running. I've also installed Python but now have some questions. 1. Where do files like www.isfdb.org/cgi-bin/ea.cgi come from? I'm assuming it's a small CGI that calls ea.py.

A Makefile changes ea.py into ea.cgi. It makes two small modifications to the program: It changes the extension (from py to cgi, as many sites will only execute CGIs with that extension) and changes the #!_PYTHONLOC line with the actual location of the python interpreter.

2. is there an index.html or how are you mapping from the root to cgi-bin/index.cgi?

There is a short redirect page for people who still reference index.html:
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
 <html>
 <head>
   <title>Internet Speculative Fiction DataBase</title>
 </head>
 <body bgcolor="#ffffff" link="#357bf0" text="#000000" vlink="#309372">
 <script type="text/javascript">
         location.replace("http://www.isfdb.org/cgi-bin/index.cgi");
 </script>
 <center>
 If you see this page, click [<a href="http://www.isfdb.org/cgi-bin/index.cgi">here</a>].
 </center>
 </body>
 </html>
 

3. there are references to mediawiki tables such as mw_cur, mw_user, mw_user_rights. I installed the current version of MediaWiki which does not have these tables. It's not clear if the reason I don't have them is because of a MediaWiki upgrades or if ISFDB has triggers or some other mechanism to copy from the MediaWiki database into tables in the isfdb database. While mw_user seems similar to my current user table (the field names/contents seem to match) tables such as mw_cur and mw_user_rights don't exist at all or have been renamed to something very different. Marc Kupper (talk) 20:14, 10 Jun 2007 (CDT)

There are differences in the tables between the old MediaWiki and the new, and I haven't looked at what porting changes would be needed to support those yet. Alvonruff 07:07, 11 Jun 2007 (CDT)
Thank you Al - I've got the main page (index.cgi) displayed. One thing I see is that many URLs, such as the Magazine list, are hard coded to http://www.isfdb.org/ or related URLs and should go in localdefs.py or an external config file. I can't display publications at the moment until I figure out how to deal with the new MediaWiki table layout. Marc Kupper (talk) 03:25, 12 Jun 2007 (CDT)

ISFDB installation instructions

This thread started on User talk:WimLewis#ISFDB installation instructions but I believe it's better here. Marc Kupper (talk) 22:52, 13 Jun 2007 (CDT)

One thing that comes to mind is that you have been able to get the ISFDB scripts to run at home, so I was wondering if you could update the ISFDB installation instructions that I started a while back while the experience is still fresh in your memory? Thanks! Ahasuerus 02:00, 15 May 2007 (CDT)

Seconded! I still wonder what all the warnings my plain installation of the DB gave me mean, but as I've not found a major problem with that yet I'm tempted to move to the next stage. (Umpteenth stage is of course to show how much better it would be if we switched to whatever technology my company are selling at the moment, but I'll settle for learning something else that WORKS.) BLongley 17:35, 16 May 2007 (CDT)
The first time I ran the install script I started from the top of the page where it says "Load into MySQL by starting up the mysql client (mysql) and "source"'ing the backup." As ISFDB was not selected it must have installed right on the ibdata1 database as it hosed out MySql badly and it would catch error 11 during startup. I had to uninstall and delete all of the MySQL files before a reinstall gave me a working MySQL. I then read further down in the directions where it has "create database isfdb; connect ISFDB; source backup;" and that worked without error.
Unfortunately, I accidentally deleted the notes I took about the installation of MySQL. From what I recall, I accepted the defaults on pretty much everything except the character encoding where I opted for UTF-8 to allow for Unicode / foreign language support. I'm still struggling with getting the full CGI to run with the main issue being that the MediaWiki tables have changed. I could not find ISFDB's version of MediaWiki for download. Al, could you make a gzip of the entire database, including the MediaWiki stuff available? You can just zap out the e-mails (if there are any) and the password hashes from mw_users. I'd like to compare the old tables against the current latest MediaWiki to see if upgrading is practical. Marc Kupper (talk) 22:52, 13 Jun 2007 (CDT)

BLongley's Thread

This thread originated on ISFDB:Moderator_noticeboard#Extremely long user talk pages which drifted into installing ISFDB. I've copied the part that drifts here as others that want to install ISFDB may be interested. Marc Kupper (talk) 21:07, 16 Jun 2007 (CDT)

I looked at some of the code this weekend during the downtime. I got nowhere near suggesting improvements, but now have a GUI front end working on the database at last (TOAD for MySQL) and Python stuff now opens in a Python editor. (For anyone trying this still, the sticking points were having to configure the MyODBC datasource after downloading, and copying the "my" config to the Windows home directory.) BLongley 17:47, 11 Jun 2007 (CDT)

(unindent)While struggling with installing the ISFDB locally, it would be very beneficial if we could update ISFDB Downloads with any additional hints and directions that we can come up with. That way the next suffe^H^H^H^H^H editor will hopefully have an easier time of it :) Ahasuerus 10:25, 12 Jun 2007 (CDT)

Will do, as and when I find out what went wrong. :-/ I think 'accepting all defaults' was a mistake on my Windows system, but I don't yet know what language support options would have worked better yet. It's compounded by the rotten MySQL Command-line client which has even LESS support for funny characters, which is why I went looking for a GUI front-end. I can write that up if people like, but there's not a section for it yet and do we really want a dozen different favourite front-ends documented (e.g. I got HeidiSQL working too, but wouldn't recommend it) if the aim is to get a working local ISFDB instead? It looks as if I should either explore Cygwin tools if I stick with Windows, or turn a spare old PC into a proper Linux Apache server, I can always get those working far more easily than Windows stuff. BLongley 13:23, 12 Jun 2007 (CDT)
Bill - I'm with you on that the mysql command line client does not seem to deal with "funny characters" by default. So far I've shrugged and ignored it. Maybe what's needed is a FAQ of sorts for the development side of ISFDB so that as someone researches this issue we can save the answer in the FAQ as it will come up for others. I believe I understand the problem in that MySql is dumping out raw UTF-8 and the mysql client is a console application which means it expects a "dark ages" data stream and fiddling with a nastiness called "code pages." Marc Kupper (talk) 21:07, 16 Jun 2007 (CDT)
Well, it's raw Latin1 in this case, but yes, the console application is probably confused by anything more than 7-bit ASCII characters. I'm not going to try and sort that out (there's other nastiness like a lack of selectable copy and paste options that make it unpleasant) so getting a GUI working was my priority. I got HeidiSQL working first but the longer text fields just show up as "(MEMO)" which is even LESS informative, so I went back and researched TOAD a bit more thoroughly and got it working. The important bits as I remember them:
  1. Download mysql-connector-odbc-3.51.15-win32.zip as well as the main MySQL stuff. Unzip and install that. Go to Control Panel, Administrative Tools, Data Sources (ODBC), and setup a connector to the ISFDB database on server localhost.
  2. Copy the "my" config file you created during MySQL installation from "\Program Files\MySQL\MySQL Server 5.0" (or wherever you installed MySQL) to your "\WINDOWS" folder.
This seems to allow the GUI tool of your choice to find the ISFDB database, although I don't really understand why, as the user-id/password I used for the ODBC connection doesn't have enough permissions to actually read data after connection via TOAD... weird. :-/
Anyway, the tools I got working were TOAD (ToadforMySQLFreeware_203.zip) and HeidiSQl (HeidiSQL_3.0_Setup.exe) although I'm still not that impressed with either and may try and get the data into an Oracle database where I can unleash my fiendish PL/SQL skills on it... but of course, that is getting totally away from the "local ISFDB" installation idea and more towards the "data cleanup investigation" tasks. BLongley 07:28, 17 Jun 2007 (CDT)
Due to recent queries about whether it's possible to get Excel to recognise database changes (and maybe vice versa, I'm not sure) I can confirm that if you've got the local MySQL database up and running on a Windows system, it's actually very easy to get the ISFDB data into Excel, so feel free to ask me for advice if you can't do that yet. If you want to get SPECIFIC data back about a Title, or even a single Pub, based on something you already have in your spreadsheet/workbook - sorry, don't know that much. BLongley 15:33, 17 Jul 2007 (CDT)
But I can assure you that it's perfectly possible to get the ISFDB data into Excel, and if you're a whiz with Excel then maybe you can help with a "spotting updates" effort. I can do a little with ISFDB data in one workbook and MY data in another, and try some matching. It's not my favourite pastime, but if people are trying to do useful things via Microsoft tools, I'm happy to help. If only so I can gloat about how much better it would be with other tools. ;-) BLongley 15:33, 17 Jul 2007 (CDT)

Make and makefiles for Windows

At this point I have pretty much everything working on my system except I can't run "make". I can get make from gnu

There's a problem in that the existing myfiles assume a Unix environment. I thought about Cygwin but don't want too much extra baggage. In the mean time I did the "make" manually to build my CGI files. I'm sorry to leave the thread hanging at the moment but I'm pretty pressed for time this month. Marc Kupper (talk) 21:07, 16 Jun 2007 (CDT)

One file looks suspiciously small?

19-Jan-2009 is about a fifth the size it should be? BLongley 23:05, 10 February 2009 (UTC)

There are occasional network issues that prevent the backup files from getting properly copied to isfdb2.org. Sometimes the backup file will be missing and sometimes we get a truncated version. Ahasuerus 00:19, 11 February 2009 (UTC)

CVS secret

OK, what is the secret to getting the code via CVS (or whatever revision control system is now in use)? The SourceForge projects seems to have current bug tracking and there is a section for code referencing CVS but even the link to browsing via web seems to fail and though I am now supposed to be a developer I cannot get it to work via ssh or anonymous pserver. Is SF borked or what? Uzume 02:09, 1 February 2011 (UTC)

SourceForge was hacked a few days ago and they disabled CVS until they are sure they have resolved the issues. Some details can be found on their blog. I do not believe they have restored service yet. --JLaTondre 02:22, 1 February 2011 (UTC)
Thanks for speedy update. I knew they had been hacked but I was not aware it was directly related to and affected CVS (though I am not really surprised). I notice as part of their restoration effort they are saying they are planning to look carefully as discontinuing CVS service in favor of Subversion. Is there anyway to get an fairly recent snapshot of the code until they restore things? The link on the website mentions http://www.isfdb.org/isfdb2.tar.gz but that is from 2008—far from recent. Thanks. Uzume 03:55, 1 February 2011 (UTC)
I keep backups of the ISFDB software as it exists on the live server in a secure undisclosed location, but they can't be released to the public since they are not sanitized and contain passwords etc. Unfortunately, I won't have access to my development server until Thursday and by then Sourceforge should be back up. Hopefully... Ahasuerus 06:29, 1 February 2011 (UTC)
I understand—thanks for trying. Do you have any thoughts on converting/moving to a different revision control system? I personally feel SourceForge is sort of slow and would prefer something like git and github but even doing a SourceForge upgrade from CVS to SVN might be worthwhile. Uzume 13:27, 1 February 2011 (UTC)
The main advantage that SVN has over CVS was brought into focus a few weeks ago when I needed to revert a whole bunch of changes in our CVS repository and it was a much bigger pain than it had to be. In SVN it would have taken a few clicks. I am not familiar with git*, but we can ask Al, who is much savvier about all things software. Ahasuerus 06:18, 2 February 2011 (UTC)
Both SVN and git (and several others) have atomic changesets vs. grouped file revisions (e.g., CVS). git also is a distributed revision control system (some others are hg and bzr, etc.) so people can have their own repositories and code can be merged about from one repository to another. One can have code development networks that are treed or other wild network structures—it is just more naturally flexible since it is not based on a centralized structure. Uzume 18:44, 4 February 2011 (UTC)

Well, it is considerably past Thursday and I do not see CVS@SourceForge back up. Is there any chance of getting a sanitized copy of the backups when you get to it? Thanks. Uzume 15:41, 7 February 2011 (UTC)

SourceForge now says that CVS should be back up by the end of the week. I'll see what I can do in the meantime... Ahasuerus 03:35, 8 February 2011 (UTC)
Thanks—that is much appreciated. Uzume 16:51, 9 February 2011 (UTC)
I think the universe is conspiring to keep the source code away from you. The Internet connection used by my development server was "upgraded" to 20Mbps earlier roday, at which point it promptly went down and has stayed down for the last 12+ hours :( Ahasuerus 19:04, 10 February 2011 (UTC)
Well I appreciate the effort anyway. Uzume 05:41, 11 February 2011 (UTC)
The development server is back in business, but in the meantime CVS has been restored! ;-) Ahasuerus 15:13, 11 February 2011 (UTC)
Yes, I can confirm that I have now updated my local copy. It took ages as I'd forgotten so much about how I did it before, and it needed the "-B" flag in the end. I suppose I ought to check whether there's any small developments within my limited Python abilities, just to get back in practice. BLongley 19:37, 11 February 2011 (UTC)
Excellent! Yes, I finally managed to check out the CVS code (using ssh). Thanks again. Uzume 00:25, 12 February 2011 (UTC)
Great news! Have you managed to build a working local copy yet? I think that's the next step, or you can't test changes. BLongley 01:26, 12 February 2011 (UTC)
I am working on it. Unfortunately, I do not have as much time to work on it as I had in the last few weeks when fate seemed to conspire against me getting the code. Uzume 06:14, 12 February 2011 (UTC)

Local Users

OK! I successfully built a working local copy. Well most of it works anyway. I noticed the DB dumps do not include the schema for the isfdb.license_keys table and edit/keygen.cgi crashes. I understand why the data is not included (security issues) but perhaps we can get the empty table schema included in future dumps? (Yes, I found the schema here: isfdb.sql but it is a little cumbersome to hack in) I also thought it was interesting that though the user database is cleared, there are still piles of tags in the database referring to absent users (except the users I have locally recreated of course). Since the tags are pretty much public data this is not really a security issue but it is not useful and sort of confusing/disconcerting (votes are removed but tags aren't?). Some of this some of you probably already know but since I am going through such for the first time I thought I would provide comments since it it has probably been a while since you have done a fresh install, etc. Anyway, now I can begin to work on some bug fixes and/or perhaps even some feature requests, etc. I shan't have time right away but at least I can start on such. Uzume 05:47, 14 February 2011 (UTC)

Good point. I never spotted the missing table before, but then I test API submissions against the live database. Tags might be useful to have properly implemented locally as there is (or should be) a FR to move them when variants are created. BLongley 17:38, 14 February 2011 (UTC)
That would be Bug report 3165570 -- User-defined tags are not moved when VT is created. BTW, don't forget to check the Development page, which lists all scripts that are currently in flux. This will help avoid collisions. Ahasuerus 01:04, 15 February 2011 (UTC)
Bill: My point wasn't that tags did not work correctly locally—they do (except as detailed in listed bugs). The point was they are not purged from database dumps with all the other user information such as logins, votes, etc. So even in my "working local copy" I can see your tags (if I can ever create a user with the same user id you have on the real live server/database). These are effectively dangling tags as they to not correctly link to users in the local copy (they still correctly link to titles, however). Uzume 02:29, 15 February 2011 (UTC)
Well, I'd argue that "Users of the Tag xxxx" giving a result of "UNKNOWN (112) , UNKNOWN (10) , UNKNOWN (2) , UNKNOWN (2) , UNKNOWN (2)", etc, isn't ideal. Or "Verified by UNKNOWN on 2007-05-01 16:15:29". So I guess it would be better to have dummy users rather than just delete the users. I don't think I'd want to delete things like Tags just because the user entry isn't present - I personally use the Verifications table a lot for instance. BLongley 20:55, 15 February 2011 (UTC)
That is a very good point. And it isn't like verifications, votes, and tags are private data. Perhaps user data could be left in too as long as the authentication data is somehow sanitized (like changing everyone's password to a fix value that could not be entered via the web interface). This way the user data would not be gone in local copies but no one could use any of the accounts without direct access to the local database (meaning the local owner of the data copy). This could also mean you wouldn't have to make a new account in your local working copy. You would just have to re-enable authentication for your account (and make it mod, etc. for local use and testing, etc.). Perhaps a better dump script with such features could be developed. Uzume 05:50, 16 February 2011 (UTC)
I don't know what the dump script is like, but it's simple to recreate distinct Users to allow the local copy to work a little better:
INSERT INTO mw_user 
select user_id, CONCAT("USR",format(user_id,0)),'X','X','X','X','X','X','X',0,0,0,0,0,0
FROM ( 
	Select DISTINCT user_id FROM (
		SELECT DISTINCT user_id from tag_mapping
		UNION
		SELECT DISTINCT user_id from verification
	) a
) b
Of course, having the downloadable backup file work better out of the box would be better than us having to add fix scripts to installation instructions. BLongley 21:12, 16 February 2011 (UTC)
Actually, I'm slightly disappointed: over thirteen thousand users and only 260-odd have bothered to tag or verify something. :-( BLongley 21:18, 16 February 2011 (UTC)
Yes, but how many of them were bots? Ahasuerus 22:15, 16 February 2011 (UTC)
The vast majority it seems. We must be really user-vicious or bot-easy. BLongley 02:20, 17 February 2011 (UTC)
I've just wasted three hours trying to get an improvement to "Top Verifiers" to work before realising that the current script doesn't work without a fix like above. Time to add it to the instructions, maybe? BLongley 16:52, 7 May 2011 (UTC)
Here's the problem we have:

Image:Verifiers.jpgImage:Verifers 1.jpg.

It's awkward to test stuff involving users. :-( BLongley 20:58, 7 May 2011 (UTC)
I have installed the new version of the script -- looks good! BTW, both the old version and the new version error out of there are no verifications on file. A minor thing and unlikely to affect anything, but we may want to fix it at some point. Ahasuerus 07:36, 8 May 2011 (UTC)

And on that note, just what exactly does the dump script look like? Can that be published so we can perhaps help develop it better to include and or remove content better? Obviously anything would be submitted and subject to scrutiny before being applied, etc. The best way would just be to check it into the repository. Having it would also be useful for exposing certain bugs, etc. too in case I want to sent someone else a snapshot of my database (as I too do not want to sent my local user passwords, etc. too). Thanks. Uzume 00:33, 17 February 2011 (UTC)

That's a good point. I developed the dump script some years ago, before we had CVS, and it never occurred to me to check it in. I'll need to clean it up first since it also populates Fixer's database and performs other arcane and unholy activities. Ahasuerus 00:51, 17 February 2011 (UTC)
Obviously not something we can test with live data, but we could review the code. I've developed data-obfuscation scripts for a few big companies and am willing to help with such. BLongley 02:20, 17 February 2011 (UTC)
Actually it would not be that hard to test with live data, however if you mean "the" real main ISFDB then yes, I agree (for one not many of us have direct access to all of that). But I can easily populate a number of users on my local copy to test it out with for development. If you like one of us can create a FR to check in the DB dump script in order to track it. Uzume 05:33, 17 February 2011 (UTC)
I also noticed there is other data that is basically public but seemingly not included in the dumps. For example, submission data is not included but history data (so far only for author updates) is. So in a local working copy (with different URL of course) this works (with "UNKNOWN" users): http://www.isfdb.org/cgi-bin/history.cgi?1440708 but this does not: http://www.isfdb.org/cgi-bin/recent.cgi Uzume 01:05, 25 February 2011 (UTC)
The submission table is huge and would triple (?) the size of the publicly available backups. It's not particularly useful for people trying to set up a local copy of ISFDB, so the trade-off doesn't seem to be worth it.
In other news, I have been reunited with the development server and plan to test your changes over the weekend. Ahasuerus 07:08, 25 February 2011 (UTC)
That is good to know on both fronts, though it might be nice if the submissions data was available somewhere even if not in the default download. You make it sound like my changes are the only ones (or almost). I noticed there was another tester at one time and there seems to be a means to offer to test things. Though methinks it is a slight conflict of interest to test my own submitted patches (not to mention I would not submit them if I hadn't already tested them so my further "testing" them probably won't find anything and be useless), I have been keeping my code up-to-date so anything else I have probably already been testing. But if you want or need someone to test specifically other changes, feel free ask. I suppose I should not just ramble and should make more development related comments in a better place than "Downloads". Uzume 14:58, 25 February 2011 (UTC)
Personal tools