How to build a simple Chrome Extension

For those who just want to download a working example, please use this link. The content was downloaded from lifehacker.com and modified to be compatible with Google’s Manifest 2.0.


Last week I needed to write a very simple extension for Chrome.
I started on this page, but so many things were broken (links, files) that I had to search for another solution.

Finally I found a pretty good step-by-step process on Lifehacker’s website. Usually the easiest way for me to develop something is to take a working solution and modify it to my needs. LH’s extension was working, but the problem was that it used the old Manifest 1.0, and Google would not accept it to the store. So the task was to rewrite it to be compatible with Manifest 2.0.

Step 1: Change manifest.json.

The first thing here is to add this piece of code at the end of your manifest.json (the leading comma attaches it to the last existing entry):

,
"manifest_version": 2

When you do this, Chrome is going to complain about “background_page”, since it’s not compatible with Manifest 2.0. Solution: change it to

"background": {
"page": "background.html"
},
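For reference, here is what a complete minimal manifest.json could look like after both changes. This is just a sketch – the name, version and popup file are illustrative placeholders, not the actual values from LH’s extension:

{
    "name": "My Extension",
    "version": "1.0",
    "manifest_version": 2,
    "browser_action": {
        "default_popup": "popup.html"
    },
    "background": {
        "page": "background.html"
    }
}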

Step 2: Move all inline JavaScript to .js files.

Google changed its security policy, so you can’t run scripts inline in the HTML anymore. I created two separate files – background.js and popup.js – and moved all the inline scripts into them.
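As a sketch of what that move looks like (the button ID and handler here are made-up examples, not LH’s actual code): instead of an inline onclick in popup.html, you load the external file and attach the handler there.

In popup.html:

<button id="go">Go</button>
<script src="popup.js"></script>

In popup.js:

// attach the handler here instead of onclick="..." in the HTML
document.getElementById('go').addEventListener('click', function() {
    // whatever the inline script used to do goes here
});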

That’s it. You can download my working code here.

One more thing that I found extremely useful is Chrome’s developer console for debugging the extension. You can use it for both your popup and background pages: right-click the popup to inspect it, and use the “Inspect views” link on the chrome://extensions page for the background page.


How to get a motorcycle driver’s license in California

I just recently got my M-1 driver’s license in California and I wanted to share my experience.

Here’s the step-by-step process for getting your license in the easiest possible way:

  1. Sign up for motorcycle courses. Even if you know how to ride and have a lot of experience – DO IT. I went to these guys and was extremely happy with the results. The thing about these courses is that you get a certificate that exempts you from the riding test at the DMV, which is MUCH harder. Plus you will get a good discount on your insurance. Plus you will learn a lot about motorcycles and how to ride them safely. If you’re just starting, these courses are a MUST. They will teach you how to ride a bike even if you’ve never seen one before.
  2. The course has two parts: an evening theory class and two days of riding for six hours. Do it before going to the DMV – you’ll learn A LOT. You will have to pass a written test at the end of the evening class and also demonstrate your skills after the two days of riding. It should be easy enough – everyone from my group passed.
  3. Safety is their #1 priority, so pay attention! Some people were not admitted to ride because they did not wear proper boots (they have to cover your ankles), so they had to come back another day. Don’t worry about the helmet – you can use one of theirs.
  4. After the courses you will get a certificate that you should bring to the DMV.
  5. Get the handbook and read it. Seriously, read it! The test is not that easy, and ALL the questions are in the handbook.
  6. Go to the DMV to take the written test. You will have three attempts. There are 25 questions in the test. You can make four mistakes on your first attempt and only two on your second and third attempts. I highly recommend you come prepared (read the handbook).
  7. That’s it. Now you hand them your orange certificate and they will mail you your new DL.
  8. Don’t forget to mention your certificate when you get your bike insurance – it will save you some extra money.

Drive safely!

How to quickly copy MySQL InnoDB database without using MySQL dump

I had a problem recently when I had to copy a pretty big InnoDB database from one server to another.

Doing export-import through mysqldump was taking forever, so I found this easy way to do it.

  1. First of all, stop MySQL on both servers (service mysql stop).
  2. You need to copy several files from your MySQL data folder. Mine was /var/lib/mysql/.
  3. Locate the folder with the name of your database and copy it. I highly recommend rsync for copying files between two servers (see the sketch after this list).
  4. Also copy ibdata1, ib_logfile0, ib_logfile1 and mysql_upgrade_info.
  5. Start MySQL again (service mysql start).
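Put together, the whole procedure could look something like the sketch below. The remote hostname (db2.example.com), the database name (mydb) and the SSH user are placeholders – only the file names and the /var/lib/mysql/ path come from the steps above:

# stop MySQL on both servers
service mysql stop
ssh root@db2.example.com 'service mysql stop'

# copy the database folder plus the shared InnoDB files
cd /var/lib/mysql
rsync -av mydb ibdata1 ib_logfile0 ib_logfile1 mysql_upgrade_info \
    root@db2.example.com:/var/lib/mysql/

# start MySQL again on both servers
service mysql start
ssh root@db2.example.com 'service mysql start'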

That’s it! Now you have your entire DB copied and it probably took 100x less time.

Please note that you must have the same architecture on both machines. That is, if you have 32-bit on one server and 64-bit on the other, you will have to use mysqldump after all.

How much is your idea worth?

Yesterday I read a post on TechCrunch that Walmart bought ‘Social Calendar’. In most cases I ignore this kind of news, but this one was particularly interesting for me.

The thing is that several years ago I was working at a small company and we built the same application. Yes, I mean exactly the same. I spent several months developing it, we were very careful about our ‘launch strategy’, and finally the day came when we opened it to the public. I was really excited. What happened next? Well, nothing really. Several months later we had 100 customers, most of whom were our friends. Did that mean the idea was not very bright? For my boss, that was definitely the case.

The lesson I learned yesterday is the same lesson I learn every day by looking at people who say “I’ll tell you about my idea, but you have to sign this NDA”. Your idea is worth nothing. NOTHING. Exactly 0. What’s worth millions (sometimes billions) of dollars is the implementation. Our company and ‘Social Calendar’ had the same idea, but the difference was in the execution. The guys from ‘Social Calendar’ executed it right, and they are probably millionaires by now. We could not. That makes all the difference in the world.

So, “How much is your idea worth?”. Zero.

Apple/Google product vs. yours

I’m sure a lot of people know what this picture is about 🙂

How to write a simple web crawler in PHP

These days more and more websites use information from other sites to populate their content.
One of the best ways to do that is with a web crawler – a script that browses thousands of pages automatically, parses out the information you need, and puts it into your DB.

Here is an easy way to write a simple web crawler in PHP.

Step 1. You will need cURL. I do not recommend functions such as file_get_html or file_get_contents: your crawler will probably have to query thousands of pages, and the connection is the bottleneck here. I’ve run several tests, and cURL works significantly faster.

Step 2. You will need a list of the pages to query. Very often, if you need to scrape information from one website, you will have to write two crawlers: one that gets all the links you need, and another that goes through those links to fetch and parse the information. One of the best ways to get the list of links is to look at the sitemap. Sitemaps have two huge advantages:

  1. They are usually located at http://yourwebsite/sitemap.xml – so there is no problem finding them.
  2. They are in XML format, and it’s very easy to parse XML using, for example, PHP’s built-in SimpleXML library.

Let’s use an example to make things simpler. Say I want to get all the authors who have ever posted something on TechCrunch. My first destination is http://techcrunch.com/sitemap.xml. As mentioned above, XML is really easy to parse, so now we have a list of all the pages on TechCrunch.
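Here is a minimal sketch of that parsing step with SimpleXML. It assumes the sitemap is a plain <urlset> of <url>/<loc> entries – large sites often serve a sitemap index pointing to sub-sitemaps instead, in which case you would repeat this for each sub-sitemap. getUrl() is the cURL helper from Step 3 below:

// fetch the sitemap and collect every page URL into $links
$xml = simplexml_load_string(getUrl('http://techcrunch.com/sitemap.xml'));

$links = array();
foreach ($xml->url as $entry) {
	$links[] = (string)$entry->loc; // each <url> node holds the page address in <loc>
}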

Step 3. You need a function that returns the cURL output. Luckily, I’ve written it for you:


function getUrl($url) {
	$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml, text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
	$header[] = "Cache-Control: max-age=0";
	$header[] = "Connection: keep-alive";
	$header[] = "Keep-Alive: 300";
	$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
	$header[] = "Accept-Language: en-us,en;q=0.5";

	$curl = curl_init(); // initialize the cURL session
	curl_setopt($curl, CURLOPT_URL, $url);
	curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; U; Linux x86_64; en-US) AppleWebKit/534.3 (KHTML, like Gecko) Ubuntu/10.04 Chromium/6.0.472.53 Chrome/6.0.472.53 Safari/534.3');
	curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
	curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
	curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // very important: without it the content is printed instead of being returned as a string
	$html = curl_exec($curl); // execute the request
	curl_close($curl); // free the handle
	return $html;
}

NOTE: more often than not, people want to stay invisible while crawling. It’s understandable – nobody wants their content scraped against their wishes. If you want to be in ‘stealth mode’, you need to use special headers like the ones above.

Step 4. Now you need to go through all the pages and extract the authors.
Let’s say all the links from Step 2 are saved in the $links array. Now we do this:


foreach($links as $url) {
	$html = getUrl($url); // the function from Step 3
	$author = getAuthor($html); // getAuthor parses the HTML and returns the name of the author
	addAuthorToDB($author); // save it to your database
	sleep(1); // one-second break between requests
	echo $author."\n"; // it's good to see the output while the script is running
}

Many developers make the mistake of running the script from a browser. There are several reasons not to do that, and the first is that your browser will almost certainly time out. To avoid that, run your PHP script from the command line. If you use my example above, you can enjoy the process by watching a new author’s name appear on a new line each second.
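For example, if the script is saved as crawler.php (the file name is just an illustration):

php crawler.php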

That’s it! All you need to do now is write two functions. getAuthor($html) parses the HTML and returns the author’s name – I will show you how to do that in one of my next posts. The second function, addAuthorToDB($author), is a simple DB insert. You can put whatever you want there instead. The basic rule is that you don’t want to work with the data coming from the crawler immediately – save it to your DB first.
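Until that post is up, here is a rough sketch of what the two functions could look like. The regular expression is purely illustrative – it assumes the page marks the author with rel="author", which may not match TechCrunch’s real markup – and the DSN, credentials and table in the insert are placeholders:

// Hypothetical extraction: assumes something like <a rel="author" ...>Name</a> in the page.
function getAuthor($html) {
	if (preg_match('/rel="author"[^>]*>([^<]+)</', $html, $matches)) {
		return trim($matches[1]);
	}
	return null;
}

// A plain insert via PDO; connection details and table name are placeholders.
function addAuthorToDB($author) {
	if ($author === null) return;
	static $pdo = null;
	if ($pdo === null) {
		$pdo = new PDO('mysql:host=localhost;dbname=crawler', 'user', 'password');
	}
	$stmt = $pdo->prepare('INSERT INTO authors (name) VALUES (?)');
	$stmt->execute(array($author));
}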

Comments? Questions? Please post them in the comments section below.