RationalSpace

Archive for the ‘Databases’ Category

How to Insert Duplicate data in table with auto-increment Field

leave a comment »

Sometimes, there is a requirement where the data needs to be duplicated in a table. This would be simple in a case where auto-increment attribute is not set on a ID column. But if set, auto increment ensures that all rows have unique and incremental ID.

So in this case the easiest way to do this is :

  1. Create temporary table with the data that you want to insert :
    create table mytab_temp select * from mytable where <some condition>.
    Now the important thing to note is that when you create a table like this its indexes do not get copied. So this new table will not have the auto-increment attribute attached to the ID column.
  2. Now comes the trick – make the value of this column as 0.
    update table mytab_temp set id=0;
  3. Insert the rows from this temp table back to the table that you wanted to have duplicated data –
    insert into mytable select * from mytab_temp
    . Now this is a cool feature of mysql – If you insert rows with id 0, and the id column is set to auto increment , it automatically adjusts the ids with auto-increment values.

So we are done:)

Advertisements

Written by rationalspace

April 8, 2014 at 12:54 pm

Posted in Databases

Maintaining order in MySQL “IN” query

leave a comment »

This is quite a common issue that you may have come across. In general, the order of any SQL query is arbitrary unless you specify an order with an ORDER BY clause. Sometimes, what we need is the ability to return results in the same order in which you passed arguments in the IN clause.

SELECT * FROM foo f where f.id IN (2, 3, 1);
I then get the following result

+—-+——–+
| id | name |
+—-+——–+
| 1 | first |
| 2 | second |
| 3 | third |
+—-+——–+
3 rows in set (0.00 sec)
As one can see, the result is ordered by id. What we want is to get the results ordered in the sequence we are providing in the query. Given this example it should return

+—-+——–+
| id | name |
+—-+——–+
| 2 | second |
| 3 | third |
| 1 | first |
+—-+——–+
3 rows in set (0.00 sec)

We can use either FIELD or FIELD_IN_SET for this:

SELECT * FROM foo f where f.id IN (2, 3, 1)
ORDER BY FIELD(f.id, 2, 3, 1);

MySQL FIELD() returns the index position of the searching string from a list of strings. If search string is not found, it returns a 0(zero). If search string is NULL, the return value is 0 because NULL fails equality comparison with any value.

Written by rationalspace

February 26, 2014 at 3:30 pm

Using PDO (Php Data Objects)

leave a comment »

A php programmer typically starts coding by learning how to use mysql_* or mysqli_* functions for database access. With Php 5.1, there is a better way of doing db calls using PDO or PHP Data Objects.  Lets analyse why:

  • Better portability – You can use it over many db layers like MSSql, Mysql etc. You need not change the code each time.
  • Prevents SQL Injections – With PDO you can uses binded params and that will prevent most sql injection attacks.
  • Object Oriented Interface

It is fairly simple to start using PDO

  • Making connection 
    try{
    $dbh = new PDO('mysql:host='.DB_HOST.';dbname='.DB_DATABASE, DB_USER,DB_PASSWORD,array( PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION, PDO::ATTR_PERSISTENT => false,PDO::MYSQL_ATTR_INIT_COMMAND => 'set names utf8mb4'));
    }catch(PDOException $ex){
    file_put_contents('/tmp/PDOErrors.txt', $e->getMessage(), FILE_APPEND);
    }
  • Preparing Query    
     $sth = $dbh->prepare("select from employees where name=:name"); 
    Here :name is a parameter that you need to bind with the value.
  • Bind param to value  
     $sth->bindValue(':name',$name);
  • Set the mode for getting values – could be PDO_ASSOC, PDO_OBJECT,PDO_ARRAY  
    $sth->setFetchMode(PDO::FETCH_ASSOC);
  • Execute the query 
    $sth->execute();
  • Fetch records  
    while($record = $sth->fetch()) {
    //do something
    }

Written by rationalspace

May 8, 2013 at 2:41 pm

Posted in Databases, OpenSource Tech

Tagged with , , ,

Wonders of mysql auto increment column

leave a comment »

Auto increment is quite a useful attribute for a column in MySQL. You need not take care of adding unique ids to data manually. Just make the id field auto-increment and the it gets a unique value with each insert.

For example , if we have a table with 3 columns and 1 auto increment unique id

We can simply do insert into table (col1,col2,col3) values (val1,val2,val3) 

So while doing an insert you  just list down column names for which you actually have data , the id column gets the value automatically.

But sometimes, we do not want to even list the columns. For a case where you are getting data for a table that has 100s of columns, writing each column name in the query can be quite exhausting. So how do you insert your query making sure the id field also gets filled?

You can achieve this by simply inserting 0 or NULL for auto increment field.

insert into table values (0,val1,..............................val100)

This is awesome!

Things to be careful about : auto increment “INT” (2^31 -1 ) column does have an upper limit of around 2B rows. If you ever exceed this, you may want to change the data type to BIGINT (2^63 -1)

Also, a delete creates holes in the table since the auto increment keys do not get re-adjusted. Whereas, a truncate resets it.

You can also change the auto increment start value to something other than 1

ALTER TABLE table_name AUTO_INCREMENT = 100;

 

Written by rationalspace

March 12, 2013 at 12:52 pm

Posted in Databases

Tagged with ,

Restore a particular table from MySQL dump

leave a comment »

A couple of times I have found myself in a situation where I have messed up a table I was working on and that needs to be restored.

Now, in most systems, we do have a daily database dump and we can restore using that but then we don’t want to dump the whole database again  as other tables may have been updated after we did the last dump!

I was looking for a tool to extract a table from the whole db dump and came across this fantastic script:

http://www.tsheets.com/downloads/oss/extract_sql.pl

It can list your tables in the dump, and extract table from your dump file

To extract the info needed to restore table ‘mytable’ from the mysqldump file ‘mydumpfile’, you’d run:

perl extract_sql.pl -t mytable -r mydumpfile > mytable.sql

And then you can simply restore the table by

mysql -u<username> -p<password> <database> < mytable.sql

Written by rationalspace

March 6, 2013 at 5:23 pm

Posted in Databases

Tagged with , ,

Load Bulk Data In MySQL

leave a comment »

Recently I was assigned the task of loading data of the order of 16 million records in a MySQL table. Now if you issue 16 M inserts from a script in a normal database, it would take hours to do the same depending on several other factors like CPU of the machine etc.

You may want to do a bulk insert by combing all records into one SQL statement, but then if you are using PHP there is a limit on the memory that it allocates to a script. “memory_limit” is defined in php.ini

A better way to handle this is by using LOAD DATA command. Output your records to a CSV or a text file where each line has the column values and then load this file into the database.

Make sure your file resides in the same location as your database’s data.

LOAD DATA INFILE ‘file.csv’ INTO TABLE my_table COLUMNS TERMINATED BY ‘,’ LINES TERMINATED BY ‘\n’ (col1,col2,col3);

Voila! Now it takes just a few minutes to load 16M records. 🙂

Written by rationalspace

March 5, 2013 at 4:26 pm

Posted in Databases

Tagged with ,

Non-relational Database Systems – NoSQL

leave a comment »

There is a paradigm shift in web applications that were there in early 1990s and now. Most of the earlier web applications were content based sites where a few people used to update / create content and a lot many people used to consume it. So there were fewer writes and more reads. Now with the advent of social networking, more and more people are creating content on the application. There is a shift from read-only architectures to read/write or write heavy architecture. In such a scenario, traditional relational databases do not suffice as the number of users grow. In a large database to make the queries faster, the logic of relating the tables is being implemented at application level. By doing this, the features of a relational db are not being used and the databases are just becoming stores.

Hence was borne a system where we do not maintain relations and is much more flexible and scalable. The NoSQL movement is a used to describe the increasing usage of non-relational databases among Web developers. This approach has initially pioneered by large scale Web companies like Facebook (Cassandra), Amazon (Dynamo) & Google (BigTable) but now is finding its way down to smaller sites like Digg. This movement was initially called the No-SQL as in an alternative to SQL but later was renames as Not Only SQL 🙂

Salient features:

  1. It isn’t a relational database.
  2. lack of fixed schemas
  3. limited support for rich querying.
  4. Offer little functionality beyond record storage (e.g. key–value stores)
  5. Highly optimized for retrieval and appending operations
  6. Has no limit on number of columns , size of data
  7. Development cycle might be faster as one does not need to spend too much time on data modelling and schema design.
  8. Has a distributed, fault-tolerant architecture. Several NoSQL systems employ a distributed architecture, with the data held in a redundant manner on several servers. In this way, the system can easily scale out by adding more servers, and failure of a server can be tolerated. This type of database typically scales horizontally and is used for managing large amounts of data, when the performance and real-time nature is more important than consistency

Written by rationalspace

February 14, 2013 at 5:51 pm

%d bloggers like this: