On those recent site troubles….

Sorry the site was down twice yesterday. We are still figuring out what happened. It’s probably innocent, but we won’t know for sure until we’ve gone through the logs, which are monstrous.

I suspect that restoring all 100k comments (lost to data corruption after the transfer of my blog) pushed the traffic over the limit: many automated spider bots and whatnot had to recache all 824 pages and 100,000 comments at once. The log files of the activity on my site over the last few days are huge, so it will take some time to sort out what happened. I gather the servers may have thought they were under attack and shut down.

When trouble strikes the site, check out my Twitter account for info. Don’t forget to follow @JoanneNova.

So why did I move the site in May?

We did get a denial of service attack in June last year. Given that and the increasing costs, we decided we needed to move the site to cheaper, more secure US servers. That move, and all the traffic, went very smoothly, but cost more than $6000 over the year. And even on the cheaper servers, the ongoing bandwidth was still costing $300 per month, so when a dedicated skeptic, who also manages other WordPress blogs as a business, offered to help out with very reduced costs, I could not help but say “Yes Please”.

Hence the site was moved again to another server in the US last month. Because the site is so large, it is not surprising that there have been a few hiccups in the weeks since we switched it over. We should be fine, though there may be more hiccups in the coming days. This new server location will be a lot cheaper to operate in the long run.

If I were funded by Exxon we wouldn’t have quite so many dramas in the switch, because we would be paying commercial rates, which I hear would be around $10,000 per annum. 😉

I have no doubt the new web manager is doing an excellent job, but obviously he has paid work to attend to, and is fitting this large new role into his spare time.

Apologies for the inconvenience. I trust readers will understand, and if we do turn up anything untoward or unexpected about the “Suspended” notice, I’ll post an update here. For the moment, assume it was one of those things that happen to large complexes of software and databases.

Cheers,

Jo

Here’s a brief synopsis of the events which took place in relation to this website – David T. (web thingy guy).

TL;DR: not a government conspiracy.

Jo contacted me about the enormous costs of keeping her website up and I related that I purchase hosting bandwidth at wholesale prices out of the US. So, we took the site down for a couple of days and moved servers. Everything seemed to be OK. Unfortunately, due to unforeseen technical issues unrelated to the new website hosting solution, we lost a whole bunch of data about a week after the move. This issue has since been resolved.

We ended up losing almost a week’s worth of data since the previous backup. That data has been recovered but has yet to be integrated back into the website. The server went down yesterday, probably as a result of being re-crawled by Google, though I’m still analyzing the log files to confirm. And seeing as the logs grow by over a megabyte every day, the analysis is slow going. For those who are not technically inclined, here’s a short explanation.

Basically, this website is dynamically generated out of a database. Upon request, the generated page is saved as a cached file. This is so the CPU and database don’t have to do any additional work to render the page on every subsequent request. A page is only regenerated when someone posts a comment on it. When Google visits your website, it requests as many pages as it can find. With our comments restored, the whole cache needed to be updated, so the CPU got hung up dynamically generating every page across the website from the database rather than serving each one from a cache file.
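To make the caching explanation above concrete, here is a minimal sketch in Python. It is not the site's actual caching plugin; `PageCache`, `render_from_database` and `simulate_crawl` are hypothetical names made up for illustration. It only shows why a crawl over a freshly emptied cache costs one expensive render per page, while a crawl over a warm cache costs almost nothing.

```python
from typing import Callable, Dict, Iterable


class PageCache:
    """Toy page cache: render a page once, then serve the saved copy."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render              # the expensive step (database + CPU)
        self._pages: Dict[str, str] = {}   # url -> saved page
        self.render_count = 0              # how often the expensive step ran

    def get(self, url: str) -> str:
        """Return the cached page, rendering it first if it is missing."""
        if url not in self._pages:
            self.render_count += 1
            self._pages[url] = self._render(url)
        return self._pages[url]

    def invalidate(self, url: str) -> None:
        """Drop one page, e.g. after a new comment is posted on it."""
        self._pages.pop(url, None)

    def invalidate_all(self) -> None:
        """Drop every page, e.g. after comments are restored in bulk."""
        self._pages.clear()


def render_from_database(url: str) -> str:
    """Stand-in for the real page build (hypothetical)."""
    return f"<html>page for {url}</html>"


def simulate_crawl(cache: PageCache, urls: Iterable[str]) -> int:
    """Request every URL once and return how many had to be rendered."""
    before = cache.render_count
    for url in urls:
        cache.get(url)
    return cache.render_count - before
```

With 824 pages, the first crawl renders all 824, a second crawl renders none, and a single new comment forces just one re-render. Emptying the whole cache at once, which is roughly what a bulk comment restore amounts to in this simplified model, puts the next crawler back at 824 renders in one go.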

So, as a result, my technical support in the US disabled this website as the CPUs were hanging to the point of almost crashing the server. We stepped through a number of website upgrades and efficiency measures during that afternoon until I was satisfied the website would be OK. I checked the website before going to bed and all seemed fine…

At 2:30am we got another hit on the CPU. What I love about my web host providers in the US is their insane paranoia when it comes to security. The next CPU hang triggered an automated script which disables the web account until action can be taken to rectify the problem. This exists to terminate a denial of service attack so that every account running off the server doesn’t lose its website and, more importantly, its email. First thing this morning I opened my inbox and discovered Jo’s site had gone down. Within an hour we put a resolution in place and arranged an action plan should the same event happen again. Short of writing my own bot and crawling the website, I can’t tell whether another crawl will take the site down. But there is now a plan in place to gather data at a network level (beneath the server) to identify and implement a permanent fix should our current efforts be inadequate.
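The host's actual script isn't something I can show here, so the following is only a sketch of the general idea, under stated assumptions: an account gets suspended once CPU load stays above a threshold for several consecutive samples. The function name, the 0.95 threshold and the three-sample window are all made up for illustration.

```python
from typing import Iterable, Optional


def first_suspension_index(
    cpu_samples: Iterable[float],
    threshold: float = 0.95,   # assumed load level that counts as "hanging"
    sustained: int = 3,        # assumed number of consecutive bad samples
) -> Optional[int]:
    """Return the index of the sample at which the account would be
    suspended, or None if load never stays high for long enough."""
    streak = 0
    for index, load in enumerate(cpu_samples):
        # Count consecutive high readings; any normal reading resets the count.
        streak = streak + 1 if load >= threshold else 0
        if streak >= sustained:
            return index
    return None
```

A rule like this cannot tell a denial of service attack from a search engine re-crawling an uncached site, which is the trade-off: it protects the other accounts on the server first and leaves the diagnosis for later.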

The state of the website at the moment can best be described as functional. We have recovered most of the lost comments and are working to splice those that are still missing into the database. It has been a complex process of weaving together database-building scripts and webpage scrapers, then running them against flat database dump files to create execution scripts to run on the live server. This is all to marry three separate datasets together, all with conflicting references and placeholders. Not a simple issue.
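For the curious, here is a much simplified sketch of the kind of merge involved. It is not the actual recovery script. The field names (`id`, `parent`, `author`, `posted`, `body`) and the rule of treating two comments as the same when author, timestamp and text all match are assumptions made for illustration. The point is that each dataset numbers its comments differently, so every reply's parent reference has to be translated into the merged numbering.

```python
from typing import Any, Dict, List, Optional, Tuple

Comment = Dict[str, Any]  # assumed keys: id, parent, author, posted, body


def merge_comment_sets(datasets: List[List[Comment]]) -> List[Comment]:
    """Merge several comment dumps, dropping duplicates and renumbering.

    Each dataset has its own id numbering, so ids are only meaningful
    together with the index of the dataset they came from.
    """
    merged: List[Comment] = []
    key_to_new_id: Dict[Tuple[Any, Any, Any], int] = {}
    old_to_new: Dict[Tuple[int, Any], int] = {}   # (dataset, old id) -> new id
    pending: List[Tuple[Comment, int, Optional[Any]]] = []

    # Pass 1: de-duplicate on content and hand out fresh ids.
    for source, comments in enumerate(datasets):
        for comment in comments:
            key = (comment["author"], comment["posted"], comment["body"])
            if key not in key_to_new_id:
                new_id = len(merged) + 1
                key_to_new_id[key] = new_id
                row = {
                    "id": new_id,
                    "parent": None,
                    "author": comment["author"],
                    "posted": comment["posted"],
                    "body": comment["body"],
                }
                merged.append(row)
                pending.append((row, source, comment["parent"]))
            old_to_new[(source, comment["id"])] = key_to_new_id[key]

    # Pass 2: translate each reply's parent from its old numbering.
    for row, source, old_parent in pending:
        if old_parent is not None:
            # A parent missing from its own dataset stays None (orphaned reply).
            row["parent"] = old_to_new.get((source, old_parent))
    return merged
```

Two passes are needed because a reply can appear in a dump before the comment it replies to, so no parent can be translated safely until every comment has its new id.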

I understand that 12 hours can be an eternity in internet time, so whenever any issues arise I do my best to resolve them as soon as possible. I appreciate your understanding in this matter and hope this answers any questions you may have concerning the various issues related to this site.


16 comments to On those recent site troubles….

  • #
    fenbeagleblog

    Nice to see you back up and running Jo. Your site is a valuable service, may it long continue.

    Meanwhile, it’s business as usual in Britain, and money continues to make the whirls go around…

    http://fenbeagleblog.wordpress.com/2012/06/17/ting-a-ling/


  • #

    David Thomson,

    When you find a workable solution for blocking crawlers/scrapers, please let Lucia know. The Blackboard black-holes any IP address from which a SUSE Linux user runs an out-of-the-box Firefox browser to access that blog.

    P.S. A workable solution doesn’t refer to the untrustable “User Agent” information provided by the client accessing the web server.


  • #
    Jimmy Haigh

    Welcome back…

    Tip jar topped up. Keep up the excellent work.


  • #
    Vess

    “To Err is Human; To Really Foul Things Up Requires a Computer” – an anonymous quote from 1969. And computers are so much better nowadays…

    Glad to see the site back online.


  • #
    Simon

    We understand and are tolerant of the problems. Don’t fret; keep up the good work.


  • #
    ThomasJ

    Very good to see you back, Jo! And thanks for your info, David!
    There seems to be some problem, though, with the ‘thumb up / down’ function: when clicking one, I (at least) get moved to the top of the page. Is this only with me…?

    Brgds from Sweden
    /TJ


  • #
    JakartaJaap

    Yo! Great to see you back on line…maintain the rage!


  • #
    Juliar

    It is good that the site is back up and running as now you will be able to analyse Green groups’ claims that the 5.5 tremor in Victoria this evening was caused by Global Warming! 😀


  • #

    Welcome back! The folks at WUWT were getting all bent out of shape, you’d think they had run out of chocolate or something.

    From the visit logs to my lowly site, I’m amazed at how many crawlers are out there and how many don’t identify themselves as such. It must be quite a headache for ISPs hosting large sites.


  • #
    Andrew McRae

    Well when you put it that way… damn fine effort Mr T.

    All my experience has been with Oracle, MSSQL, and Derby. Tangled with PostgreSQL once, but never tackled MySQL because of its dodgy reputation. I’m sure the only reason anyone used it is because it was free and was a cheap extra with PHP hosts. With any other RDBMS you would have a defined dump format for export/import to make moving data easier. Then you do all your data patchups with SQL, no sweat. (Admittedly I’ve had the luxury of all my apps being behind the corporate firewall.)
    If someone had told me a few weeks of comments had been lost since the last database backup I would have just written them off as gone forever. Flat files and HTML scraping?? Yikes! Errortopia! Good work recovering those under pressure.

    I reckon you’ve earned a short break.

    So how was it? 😀


  • #
    Chuckles

    Glad to see you’re recovering 🙂

    Consider using Cloudflare to front-end the site, filter traffic and do some extra caching. It’s free and works well to minimise sudden loads.


  • #

    Hey Jo,

    Sorry I have been absent lately. But it was a good time to come back (I saw the site problems over at WUWT), as I also noted your moving expenses. Ouch!

    Left a tip for you. Hopefully it will help a bit with those moving expenses. And you know about moving. NOTHING ever goes as planned.


  • #

    Hi Guys, sorry about the rating system. I haven’t been looking at the comments in this blog. I’m currently working 80-100hr weeks on a large software project which is almost at the due date. When I can get more time on the website we should be back to full data integration and plugin functionality.


  • #
    Roy Hogue

    We survived it with no real trouble. Less expensive, more secure and administered by friendly forces are all much to be desired. So good move!


  • #
    Angry

    Hi Jo,
    There seems to be a problem with clicking on the Thumbs Up and Thumbs Down icons.
    All that seems to happen is that the web page simply reloads and the count is not incremented.
    Cheers
