Monday, December 19, 2011

Documentation on handling larger database

Hi Techies,
       I created a documentation for handling larger database. Make use of it & share any other tips which I missed out.

Database server: Buy a server that can handle larger data's

Database engine: We can use either MyISAM or INNODB.

Stick to your basics right:
1)     Create table with proper indexing & primary key

2)     Don't use COUNT * on Innodb tables for every search, do it a few times and/or summary tables, or if you need it for the total # of rows, use SQL_CALC_FOUND_ROWS and SELECT FOUND_ROWS()

3)     Avoid using IN(...) when selecting on indexed fields, It will kill the performance of SELECT query.

4)     Use index to filter rows while searching

5)     Don’t use * in select when you’re not going to retrieve 60% of the columns. Make it specific like (userid, username, doj, lastlogin etc)

6)     Don't Index Everything

7)     A NULL data type can take more room to store than NOT NULL

8)     Use mysql_pconnect instead of mysql_connect in PHP

9)     Don’t use parenthesis in the select query instead use the table name as prefix for the column name.

10)  In the pagination don’t use offset, Just use Limit N. It makes pagination faster.
E.g.: With offset Select * from pages limit 10,20
Without offset Select * from pages where id > 10 limit 20

11)  Use INET_ATON and INET_NTOA for IP addresses

Optimization Tips:
1)     Use OPTIMIZE for each table
a.      E.g. OPTIMIZE Table tablename
                                                              i.      This will work only in MYISAM Engine.

                                                            ii.      While optimizing the entire table will be locked, so it should be done only when the site is offline & maintenance period

2)     To test the server speed, Just execute the following query in mysql
a.      SELECT BENCHMARK(1000000,10+10);

b.      The above command executes the expression 10 + 10 one million times. This will return 0 as result, but the execution time should be noted.
This command as to run different time of the day to check how the server handles the request.

3)     Create index for the table, we should make proper indexing to the fields which we’re going to search for the results.

4)     Before making a select query check it with Explain function & check whether the query is optimized to its best. This will be very useful when we use for complex queries.

5)     Use stored procedures to avoid bandwidth wastage

6)     Partition your database/table

7)     Delete data that you don’t need. Tables will grow, so monitor table sizes regularly.

Secure the Data’s:
While inserting or fetching data’s from the website, we need to be careful. We should follow the steps mentioned below
1)     Use mysql_real_escape_string() for the datas that has user interaction like Inserting fields & query strings

2)     Use POST method as much as possible

3)     Filter each inputs to make it secure. Use strip_tags().

4)     When you're getting the values from query string, use (int) function for getting integer values & if you want to get a string or characters use substring to get the values to a certain limit (For example: An hacker can pass ascii code to the query string & can access the database. So, we can limit upto 10 characters in the query string. If we do so, sql injection will be prevented).

5)     If you're using AJAX, pass the values by post method.

6)     Don't allow php files & other executable files in the file upload option. If you do so, the hackers can write their script in php file & upload it to your server & can access any data’s that they want.

7)     In production path don't print the mysql error code or sql query. If any error displayed then the hackers can find out the table structure for the website & can use it accordingly.