Archive for the ‘Extranious Thoughts’ Category

Auto Comment Spamming Software has No Sense of Humor

Saturday, June 12th, 2010

Almost 100% of the comments that I receive on my various blogs come from auto comment spamming software. I know this to be true because I can see it in the traffic logs. I see exactly the same comment on several blogs on the same day.

I have a website that echoes email humor. I post things that come to me via email on this site. I had added pages to this site faithfully for a while, but had not added much for several months. It had become too much work to build a new page whenever something came in that I wanted to post.

I had started to use WordPress blogs on several sites and decided to put up a blog on the WebPickups site because of the ease of posting. Posts on WordPress sites tend to get indexed more rapidly than regular pages.

Now we come to the part about the auto comment spamming software. Since this blog has become one of my higher profile sites, it is targeted by the auto spamming software. I have not fully investigated this type of software, but there must be a facility to add keywords to the search for posts to spam. Most of the comment spammers do not do a very good job with these keywords, so the comments seldom bear any relation to the posts they target.

No place is this more evident than on the humor site. I sometimes feel that I should approve some of these comments just for the comedy value. If my visitors realized what was going on it would make the comment posters look like real idiots. Having a comment like ‘wow this is great information and very helpful to me’ on a joke could be classified as comedy on its own.

My advice to those thinking of using this type of software would be “don’t bother people with your spam”. This is a bit unrealistic, just as giving the same advice to email spammers would be. People are lazy and the sales pages make these things sound like a gift from God. Just remember when reading a sales page that the page is designed to put money in the owner’s pocket. There are good products that can be moneymaking tools, but if it sounds too good to be true it probably is not true.

The second piece of advice if you must use the software would be to use very long tail keywords if the software allows for that. You will reduce the number of comments placed but will improve the chance of getting a comment approved. Don’t expect to get comments from auto commenting software approved on high page rank blogs. One might slip through occasionally but the owners of high page rank blogs have high page rank because they care, so are not likely to approve spam comments.

Google SafeSearch

Thursday, June 10th, 2010

Sometime recently I began to notice that my search results were appended with a line that said something like ‘this search took .xx seconds with safe search’. This didn’t really impact me other than to say, “so what?”. Then the title of one of my blog posts from the Web Pickups site disappeared from the search results. Since the post in question does not display Google ads even though the code does show in the source code of the page, I thought that Google had deleted the page from the search index.

I finally noticed that the line about safe search had changed and one of the words was triggering the safe search filter. I saw a link for safe search to the right of the search box and clicked on it. This produced a drop menu that allowed me to turn safe search off. With safe search disabled my page was number two in the search results and the listing at number one had a caution for malware on the site.

I investigated safe search a bit further. From Google’s Safe Search page:

About Google’s SafeSearch filter

Use Google’s SafeSearch filter if you don’t want to see sites that contain pornography, explicit sexual content, profanity, and other types of hate content in your Google search results. While no filter is 100 percent accurate, SafeSearch checks a website’s keywords and phrases, URLs, and Open Directory categories to determine and filter out inappropriate sites.

The page goes on to list the various levels of protection and how they affect the search results:

In the SafeSearch Filtering section, choose the SafeSearch level you’d like to use:

  • Moderate filtering: This option excludes most explicit images from Google Images results but doesn’t filter ordinary web search results. This is your default SafeSearch setting; you’ll receive moderate filtering unless you change it.
  • Strict filtering: This applies SafeSearch filtering to both image and web search results.
  • No filtering: This option turns SafeSearch filtering off completely.

It is stated that Moderate is the default setting. What I was seeing fits the description of strict filtering. I did not really pay attention to what level of filtering was enabled when I saw the option to turn safe search off. I think that the default behavior that I saw was Strict rather than Moderate. According to the descriptions above I should have been seeing the page listed in the search results.

At any rate, the setting is saved in a cookie so when you conduct further searches you do so under your desired search behavior. Overall I think that it is good to have the options. I suspect that if Moderate were actually the default few people would even notice a difference or complain about it. I think that the Strict filtering is going a bit too far as a default but has value for those wishing to protect young minds (or easily offended older minds). I have no other complaint with the system as long as the setting is retained in a cookie for my computer and I am aware of the situation.

A Partial Retraction . . .

Wednesday, June 9th, 2010

I wrote a couple of days ago about a possible Google Slap. I have just discovered that the page in question has not been delisted. Ads are not shown on the page, but if you pay attention you can find the page through Google. In fact the page is at the number 2 position in the SERPs. The catch is that it contains the word boob. Even though there are multiple meanings for the word it is on Google’s naughty list.

I looked at some traffic logs today and noticed that there is a label above the search results that says the word ‘boob’ has been filtered from the search by their ‘safe search’ technology. I looked the page over and saw a link for ‘safe search’. Clicking the link produces a drop-box. The option to turn off safe search is available. When safe search is turned off the page appears in the number 2 position.

This may be a bit of an overreaction on Google’s part, but they can run their business however they wish. There is no reason not to show ads on the page, but with millions of sites and probably billions of pages of content, manually checking each page where the algorithm noted a problem would be beyond even Google’s reach.

I had never heard of a single page slap, and it turns out that this is not what happened here. The title includes words with an intended double meaning to draw attention. It turns out that the double meaning causes the word to appear in Google’s filter list. Probably 99% of people that search on Google will never adjust the default behavior of the search results, so I miss the traffic opportunity that I might see otherwise. Another title would produce no results at all, so I will just take what little traffic does come from the post. The link could go viral but I will not hold my breath.

Meanwhile I have two or three other pages that are within the top five SERPs, so I am seeing some traffic to the site.

An Interesting Experience

Tuesday, June 8th, 2010

All of my websites were off line for a couple of hours this afternoon. Hostmonster pulled the plug until I corrected the problems.

On December 31, 2009 my web space was violated by a hacker. A visitor had notified me that they got a virus warning when trying to visit one of my sites. I checked it out and found the damage that the hacker had done (most of it, that is). I spent a couple of afternoons replacing files and cleaning up after the attack.

I had missed a few files, mostly not on publicly accessible pages. When I first found the problems I had renamed a few files and left them on the server. I had re-uploaded .js files to the blogs that I had in operation at the time because the .js files were a primary attack point. I had also archived three of the sites that I did rebuilds to in the fall.

Not only had the hackers altered many files, but they had uploaded additional files in the blogs. When I uploaded the replacements any standard files were overwritten with good files, but the added files remained unchanged.
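The added files are the ones a plain re-upload never touches, because nothing overwrites them. A minimal sketch of how they could be listed, assuming you have a clean copy of the site and a downloaded copy of the live site side by side on your own computer (the function names and the example paths are hypothetical, and this is not the scan that Hostmonster ran):

```python
import os

def relative_files(root):
    """Collect every file path under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found

def added_files(clean_root, live_root):
    """List files present in the live copy that do not exist in the clean copy."""
    return sorted(relative_files(live_root) - relative_files(clean_root))

# Hypothetical usage: both folders are local copies, not paths on the server.
# for path in added_files("clean_site", "downloaded_site"):
#     print(path)
```

This only finds files that were added. Files that were altered in place would need a content comparison as well, which this sketch does not attempt.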

Hostmonster had received a couple of complaints. They had done a scan on my space and found 208 files that were still affected. I had to go through the site via my FTP client and delete or correct each file. This actually did not take as long as I first thought that it might, but it did take some time. The scan had produced a text file with all of the infected files listed. I deleted the archived files and some folders that were not needed on the server and corrected the other problems.

I then called back to Hostmonster support. They did another scan on my hosting space and it came up clean, so they removed their block and I am back in business.

I should have contacted them when I was made aware of the problem. Their scan report would have helped me to clean up the site properly at the time. My traffic is much higher now than it was then, so I would have missed fewer visitors by taking action at the time.

The hackers had full access to my space. That means that they had the password for the account. The first thing that I did when I learned of the problem was to change the password, and that stopped further intrusions. Sometime later Hostmonster changed all of the passwords and made them stronger. They were aware that some accounts had been compromised. The server farm may be just too extensive to run a complete scan of all the sites on their servers. That may have required too many resources and produced quality of service problems for them. For whatever reason (mainly that there were very few public files that I had missed) it took over 5 months for them to notice the problem.

There was very little wait time on either of the two calls that I placed to Hostmonster support today. I wish that the local power company was a bit more like this, but then they are a government sanctioned monopoly and their customers don’t have the option of voting with their feet. The technical staff was knowledgeable and courteous on both calls. The few times that I have needed to call support over the past two and a half years have been the same. I am well satisfied with the service that I have enjoyed from Hostmonster.

Am I Under a Google Slap?

Saturday, June 5th, 2010

I have a website, Web Pickups, where I post the contents of interesting emails from my email network. I recently added a WordPress blog to the site where I have posted the most recent additions. I do not post anything that I consider to be X rated as that would be against the TOS of my hosting account, but some of the things that come in via email may be considered in poor taste at best.

On the 26th of May I posted an email titled ‘Black Woman With One White Boob and One Black Boob’. (The title links to the post, if you are interested.) I had been amazed by the search traffic that this post has been generating. Yesterday was no exception. Today, not so much.

I went to my server last evening to check the traffic log for another site, and while I was there I peeked at the stats for Web Pickups. There were several searches for the page listed in the referrers line of the log. I usually look at these searches to see where I am actually coming up on the results pages. When I looked at the results pages I did not see the listing on the first page.

There are many searches for email titles of these emails that go viral and the results page usually shows my site in the top three results. Since the redesign of the search results page several options that used to appear right below the search box now appear at the bottom of the page. One of those options is to change to 100 results per page rather than the default 10 results per page. I clicked on the switch and checked the first 100 results. People seldom get past the first 10 unless they are very interested in a subject, even more rarely past the first 10 pages of search results. The page did not show up within the first 100 results. I did not check further, but it appears that the listing has been pulled or at least sent to the dungeon located many stair steps below the basement of the search results.

As somewhat of a confirmation of the slap, when I checked the page to get the link URL I noticed that of the three ad units and the text link unit at the bottom of the page only one unit on that page was showing a Google Public Service ad. The spaces for the other two ad units and the text link unit are blank on that page. The good news is that ad units are showing normally on the other pages that I checked. I will be watching the traffic over the next few days to be sure that Google is still sending traffic to other pages.

This post displays some political humor. All of the characters in the image are fully clothed. There are many other search results that are returned that include the same search terms, and in this case, the slang usage does not even convey sexual innuendo with relation to the image, although without the image for reference many people would have the wrong mental image. This makes me wonder if Google made this decision or if it was done at the request of some higher power?? I take the fact that whatever slap there is has only affected the one page as an indicator that there may be some outside force in action (can you say censorship??). I suspect that if Google made the call that the effect might be more pervasive.

This particular site is more for fun than for profit. If there were huge traffic to the site it might pay its rent, but the people that visit the site are not in a shopping mindset. There is an occasional click when an ad catches somebody’s eye, but the site is an end point rather than a starting point in a buying process. If the site, as a whole, were slapped by Google I would just convert the space devoted to ads to one of the many other options available, though gaining traffic without the Google organic traffic would not be easy, and considering that this is not targeted shopping traffic, would not be of much value.

Do you have an opinion about this situation? Do you have any experience with a Google Slap, or know of a good slap story? If so please leave a comment!

Auto-Comment Spamming

Sunday, April 11th, 2010

I have put up several new blogs recently and have been spending more time clearing the comment spam from the queues. This blog is still the champion but several others are pedaling hard.

I have also been paying a bit of attention to my server logs since implementing a new program on some of these blogs. There is some debate in my mind if that program is working, but I have not yet given it a fair trial. I think that the strategy will work in the long run if I keep with it, but I have some question about the short run stats. All of that is for another post, possibly on another blog.

What I have noticed by checking the server logs is that most of my comments do not come from site visitors. It became apparent that the bulk of the comments were posted directly to the comment page without the poster coming in through the front door. I first thought that there must be a list somewhere of the urls deep-linking to comment pages, and, more or less, that must be the case, but the list is contained in auto-spamming software packages or harvested by automated url harvesters.
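The pattern described above can be picked out of a log automatically. Here is a rough sketch, assuming the server writes the common Apache "combined" log format and that comments are submitted to WordPress's `wp-comments-post.php` script. The function name and the sample lines in the usage note are made up for illustration, and treating an IP address as a visitor is only an approximation.

```python
import re

# Combined Log Format starts: ip ident user [time] "METHOD path PROTOCOL" ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)[^"]*"')

def direct_comment_posters(lines, comment_script="wp-comments-post.php"):
    """Return IPs that POSTed a comment without any earlier GET request."""
    seen_get = set()   # IPs that have requested at least one page
    flagged = []       # IPs that posted a comment "through the back door"
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        ip, method, path = match.groups()
        if method == "GET":
            seen_get.add(ip)
        elif method == "POST" and path.split("?")[0].endswith(comment_script):
            if ip not in seen_get and ip not in flagged:
                flagged.append(ip)
    return flagged
```

Fed the lines of a log in time order, the function returns the addresses that went straight to the comment script. A real visitor who reads a post first would show a GET before the POST and would not be flagged.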

Having seen hints in places around the web and coming to the realization that this auto-comment spamming was going on I spent a bit of time today with my friend Mr. Google. With his help I found several options ranging from free (with email opt-in) to around fifty dollars. There was one forum thread that ran to 34 pages before it was locked.

There may be a place for this software, but most users don’t take the time to target the result properly. The comments placed by these programs will only be on topic by the sheerest of luck. With the example that I saw you fill out a copy of the WordPress standard comment form and turn the software loose. The software posts the same post to all of the urls on the list. The software does not care or know if the comment is relevant.

I have seen, and see on a daily basis, the same post on multiple blogs. I first thought that people were just pasting the posts in but to do that they would have to enter the blog through the front door, find the post, go to the comment page for the post, and fill in the comment form. The server logs would show that they entered the site through a landing page and then proceeded to the comment page. Checking for the existence of this class of software cleared the mystery. The people that will use this software are unlikely to put the time in to find blogs on their subject and to produce comments that, while being general in nature, are at least on topic for the selected blogs. I recently read an article on traffic building in which the author stated that the process of blog commenting could not really be successfully automated. I agree with his opinion.
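Spotting the same comment on multiple blogs does not have to be done by eye. A small sketch, assuming the pending comments from each blog have been gathered into a list of (blog name, comment text) pairs. The function name and the sample data in the usage note are hypothetical, and matching on normalized text is just one simple way to define "the same post".

```python
from collections import defaultdict

def repeated_comments(comments, min_blogs=2):
    """Map each comment text seen on at least min_blogs blogs to those blogs."""
    blogs_by_text = defaultdict(set)
    for blog, text in comments:
        # Ignore case and spacing so trivially altered copies still match.
        key = " ".join(text.lower().split())
        blogs_by_text[key].add(blog)
    return {
        text: sorted(blogs)
        for text, blogs in blogs_by_text.items()
        if len(blogs) >= min_blogs
    }
```

For example, given the pairs ("humor", "Wow this is great information") and ("tech", "wow this is great information"), the result maps the lowercased text to both blog names. A comment that appears on only one blog is left out.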

Are you plagued by auto-comment spam on your blogs? You could enter this conversation with a real comment.

WP-Article-Fetch Report

Friday, April 9th, 2010

I have the WordPress plug-in WP-Article-Fetch working on a couple of blogs. I had mentioned it in an earlier post and I am seeing some interest from search traffic so I think that it is time to detail my experience with the plug-in a bit more.

I reread my earlier post and will add a few further observations to what I had said. Here is the portion of the earlier post that applies:

The second plug that I have just discovered is an auto-blogging software. I put one instance of the plug-in up last evening and was able (by jumping through several hoops) to get it working. I have only looked at a couple of the articles produced so far. One had very little text to the post and that appeared to be somebody’s affiliate text link with the link stripped out. The other article at which I looked had a little more content. The formatting on the articles was not too good, but I may be able to tweak that in the css file. The software pulls images, but does not produce a margin around the image so the text runs right up to the image.

The plug-in, WP-Article-Fetch, serves the articles from a server maintained by the developers of the plug. They admit that the articles are scraped from the web. They have an article spin software on the server. The one article with content was not very human readable. I suspect that this is a shortcoming of their spin engine, but there is a lot of content that is written by non native English speakers that does not read so well either. If it is the spin engine one would have to rewrite the articles if you want a site to be proud of, but there would be some possible value if the main objective is just to attract traffic to expose to your ads. In fact people may click on an ad just to get out of there and go somewhere that makes sense.

It would be hard to recommend this plug-in to anyone. Several of the articles have been incomplete, as I mentioned in the first paragraph of my earlier post. A few of the articles provide readable content, but many are either poorly written or lost something in the translation.

There are photos included with some of the articles. The developer recommends that you host the images locally and there is a check-box to instruct the plug to do that. I did check the check-box, but no image has ever been downloaded as far as I could tell. The images are pulled from wherever they have been found on the web. If someone removes an image, or changes its location, or prevents deep linking on their site there will be a missing image in the post. Unless images are pulled from a site that specifically permits this practice it is most highly impolite to pull images from someone’s hosting space.

There may or may not be an author’s resource box with the articles. When there have been resource boxes the links have generally not been clickable. There have never been any keywords included with any article. Often the keywords that come with articles are not the best, but if an article is optimized for a keyword it is nice to have the keyword for the blog tag.

The plug-in is designed to space the posts over a period of time. Several of the posts that have been scheduled have missed the schedule. When that happens you have to manually do the post. If you want tags and categories for your posts you will also have to add them manually.

As I have found with other options, the plug-in pulled a bunch of articles initially and then the volume dropped off. There are still occasional articles inserted in the blogs by the software, but they are infrequent. The other problem with the free auto-content software that I have investigated is the filtering. Some of the articles will fit the theme of the site and some will be way off base. With any of the software that I have used there is a good deal of management required to make the software of value.

Undoubtedly original content is the best choice for your site. If you are building a site with acquired free content that content must be tightly targeted to your site theme. I am pursuing the latter course with a few blogs but there is not enough history at this point to draw conclusions. The conclusions will be fodder for a future article.

Traffic and Comment Observations

Tuesday, March 23rd, 2010

I have been watching my server traffic logs a bit over the last week or two. They are normally interesting to view once in a while, but I pay more attention to Google Analytics for my actual traffic reports.

I do think that Google misses some traffic, but most of the missed traffic probably shouldn’t count anyway. I think that there are people that hit the page and bounce before Analytics realizes they are there. Google instructs you to place the code at the bottom of the page just before the body closing tag. That means that the Google script will be the last thing read when the page is rendered. The code is JavaScript, and as such could be placed in the head section of the page. It would be read before the page is actually rendered in that case and catch more of the flash thoughts.

At any rate, the information that I pick up from the logs tells much more about the server activity. In the server logs you see all of the activity of the web crawlers and spiders. There is a lot of robot activity that mostly never shows in Analytics because the pages are not actually opened, just downloaded to the indexing server.

On my blogs I also see some other traffic that does not show up in analytics. These are the tracks of the comment spammers. I have come to the conclusion that there is a list somewhere of direct addresses to comment pages of blogs. These comment spammers come in through the back door directly to the comment page. They leave their spam comments without ever reading a post or visiting the actual blog. Probably some or most of these are automated. They all leave their tracks in the server log.

This blog has now been hit over 6,000 times with comment spam. I don’t have to deal with most of that because my anti-spam software dumps many of them directly. It does leave a few for my decision. In most cases I choose ‘delete permanently’ but if there is some relation to an actual post I occasionally approve one.

I have recently posted a comment guideline page on all of my blogs. That will not make any difference to the comment spammers, but the fact that the page is there makes me feel better about deleting the spam.

Blog commenting is a viable way to attract traffic to a site if it is done properly. Comments must be real and address the subject of the post. They need to add value to the post. Too many people just post something quick like “great post” and are off on their merry way. These types of comments add no value to the post or the blog, and are seldom likely to drive any traffic to the poster’s address.

Do you have problems with comment spam? Do you ever look at your server traffic logs? Leave a ‘quality’ comment.

Investigating CMSs

Tuesday, March 16th, 2010

A couple of weeks ago I had found a CMS package called Dolphin7. I had downloaded the file and then not done anything more with it. This evening I decided to investigate a bit more. I went to the site and found their forum. There was some dissatisfaction with a recent policy change voiced on the thread that caught my eye.

I read through that thread and then did a Google search. The search term that I chose had negative connotations and there were plenty of responses. To be entirely fair I should search for positive comments, but what I read along with the comments on the home forum makes me think that I would be well advised to focus my energy in other directions.

The major open source CMS packages include Drupal, Joomla, and WordPress, the bed for this blog. I found a couple of comparison sites with notes on all three. There is positive comment on all three packages plus a couple of other options that were mentioned. All three of the biggies have areas where they shine. WordPress is recognized as the leading blogging software and can also be put into service for many website applications. Drupal is represented as the most flexible and powerful in many ways, but has a steeper learning curve and lacks some of the front end polish of the others. Joomla is the prettiest and has many strong features.

From the things that I read I might consider digging into Drupal a bit for a project that I have in mind. This is more of a tech job and the exterior is not quite so important. I may also investigate Joomla a bit more for some other projects where a slick interface is more important. It was an interesting overview of these software packages. Professional developers recommend all three depending on the requirements of the project.

I am reasonably comfortable working with my WordPress blogs. My hobby webmastering has given me enough of a basis in css and coding so that I am not lost. The other packages might push my learning a bit more, but that would not be a bad thing.

Have you used any of these scripts? Tell me your experience in a comment. Thanks!

Welcome to DST 2010

Sunday, March 14th, 2010

This is the day that the clock more closely aligns with my personal schedule. Daylight saving time has started for 2010. Can spring be far behind?

I have been in a business that is primarily night and weekend work. My schedule is often the reverse of the 9 to 5 crowd. Even when not working I am more the night person than the day person. Having the extra hour of daylight in the evening is a great boon. During the winter darkness falls before I really get moving. Now I will have more time in my day to get the daylight type things done.

The question about spring reflects this year’s weather pattern. There have been more heating days than normal here this year. I have lived in Florida for over half my life. I am a warm weather person. Usually there are a few cold days here, but there are nice, warm days in between. Not so much this year. We have seen few days since the first of the year that even reached the 70s for the daily high and the nights have been colder as a consequence. I want my Florida Back!!

