如何比较两个数据表
有些时候,我们可能想要比较一下两个数据表,以找到其中不同的数据。比如,在进行数据移植的时候,或是在合并数据的时候,或是在比对验证数据的时候。当然比较两个表,需要这两个表结构是一样的。
我们先假设一下有如下表结构:
CREATE TABLE jajal
(
  user_id integer NOT NULL,
  first_name character varying(255),
  last_name character varying(255),
  grade character(1),
  CONSTRAINT jajal_pkey PRIMARY KEY (user_id)
)
然后,我们有两张表——jajal和jajal_copy,其内容如下:
目录
jajaljajal_copy使用FULL OUTER JOIN使用NATURAL FULL OUTER JOINMySQL SQL 代码PostgreSQL 下的SQL语句
jajal
user_id
first_name
last_name
grade
1
Some
Dude
A
2
Other
Guy
B
3
You are
Welcome
B
4
What
Other
A
5
INeed
You
C
6
Mixed
Nuts
Z
7
Kirk
Land
B
8
Bit
Shooter
A
9
Sun
Microsystem
C
10
Extra
Fancy
B
jajal_copy
user_id
first_name
last_name
grade
1
Some
Dude
A
2
Other
Guy
B
3
You are
Welcome
B
4
What
Other
A
5
INeed
You
C
6
Mixed
Nuts
C
7
Kirk
Land
B
8
Bit
Shooter
A
9
Sun
Microsystem
C
10
Extra
Fancy
B
要比较这两张表的数据,找出不一样的数据行。我们可以使用outer join 技术。我给outer join做了一个链接,是Wikipedia的,如果你对这个技术不是很清楚,还请你行看看其技术细节。
下面是具体的SQL语句:
使用FULL OUTER JOIN
SELECT
  *
FROM
  jajal j
  FULL OUTER JOIN jajal_copy jc ON jc.first_name = j.first_name
   AND jc.last_name = j.last_name
   AND jc.grade = j.grade
   AND jc.user_id = j.user_id
WHERE
  j.user_id IS NULL
  OR jc.user_id IS NULL
运行结果如下:
user_id
first_name
last_name
grade
user_id
first_name
last_name
grade
[NULL]
[NULL]
[NULL]
[NULL]
6
Mixed
Nuts
C
6
Mixed
Nuts
Z
[NULL]
[NULL]
[NULL]
[NULL]
使用NATURAL FULL OUTER JOIN
关于natural join,你可以看看Wikipedia是怎么说的。
SELECT
       *
FROM
       jajal j
       NATURAL FULL OUTER JOIN jajal_copy jc
WHERE
       j.user_id IS NULL
       OR jc.user_id IS NULL
运行结果如下:
user_id
first_name
last_name
grade
6
Mixed
Nuts
C
6
Mixed
Nuts
Z
MySQL SQL 代码
MySQL 并不支持 FULL OUTER JOIN,但是我们可以使用LEFT JOIN 和 RIGHT JOIN 来实现这一功能。如下所示。
SELECT
*
FROM
jajal j
LEFT JOIN jajal_copy jc ON jc.first_name = j.first_name
AND jc.last_name = j.last_name
AND jc.grade = j.grade
AND jc.user_id = j.user_id
WHERE
jc.user_id IS NULL
UNION ALL
SELECT
*
FROM
jajal j
RIGHT JOIN jajal_copy jc ON jc.first_name = j.first_name
AND jc.last_name = j.last_name
AND jc.grade = j.grade
AND jc.user_id = j.user_id
WHERE
j.user_id IS NULL
或者你更喜欢NATURAL JOIN 版本
SELECT
*
FROM
jajal j
NATURAL LEFT JOIN jajal_copy jc
WHERE
jc.user_id IS NULL
UNION ALL
SELECT
*
FROM
jajal j
NATURAL RIGHT JOIN jajal_copy jc
WHERE
j.user_id IS NULL
当然,如果你需要一个MySQL的存储过程的话,下面是一个示例:
DELIMITER $$
CREATE PROCEDURE db_schema.tablediff
    (schema_name VARCHAR(64), table1 VARCHAR(64), table2 VARCHAR(64))
BEGIN
    DECLARE done INT DEFAULT 0;
    DECLARE sql_statement TEXT DEFAULT '';
    DECLARE sql_statement_where TEXT DEFAULT '';
    DECLARE sql_statement_pk TEXT DEFAULT '';
    DECLARE col_name VARCHAR(64);
    DECLARE col_name_cur CURSOR FOR
        SELECT
            COLUMN_NAME
        FROM
            information_schema.COLUMNS
        WHERE
            TABLE_SCHEMA = schema_name
            AND TABLE_NAME = table1
    ;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
    OPEN col_name_cur;
    traverse_columns: LOOP
        FETCH col_name_cur INTO col_name;
        IF done THEN
            CLOSE col_name_cur;
            LEAVE traverse_columns;
        END IF;
        SET sql_statement_where = CONCAT(sql_statement_where,
            ' AND a.', col_name, ' = b.', col_name);
        SET sql_statement_pk = CONCAT(sql_statement_pk,
            'AND b.', col_name, ' IS NULL');
    END LOOP;
    SELECT
        COLUMN_NAME INTO col_name
    FROM
        information_schema.KEY_COLUMN_USAGE
    WHERE
        CONSTRAINT_SCHEMA = schema_name
        AND CONSTRAINT_NAME = 'PRIMARY'
        AND TABLE_NAME = table1
    LIMIT 1
    ;
    IF col_name IS NOT NULL THEN
        SET sql_statement_pk = CONCAT('AND b.', col_name, ' IS NULL');
    END IF;
    SET sql_statement = CONCAT('SELECT * FROM ', schema_name, '.', table1, ' a LEFT JOIN ', schema_name, '.', table2, ' b ON TRUE');
    SET sql_statement = CONCAT(sql_statement, sql_statement_where, ' WHERE TRUE ', sql_statement_pk);
    SET sql_statement = CONCAT(sql_statement, ' UNION ALL SELECT * FROM ', schema_name, '.', table1, ' b RIGHT JOIN ', schema_name, '.', table2, ' a ON TRUE');
    SET sql_statement = CONCAT(sql_statement, sql_statement_where, ' WHERE TRUE ', sql_statement_pk);
    SET @s = sql_statement;
    PREPARE stmt1 FROM @s;
    EXECUTE stmt1;
    DEALLOCATE PREPARE stmt1;
END$$
DELIMITER ;
PostgreSQL 下的SQL语句
下面是PostgreSQL的一个存储过程:
CREATE OR REPLACE FUNCTION tablediff (
    IN schema_name VARCHAR(64),
    IN table1 VARCHAR(64),
    IN table2 VARCHAR(64)
) RETURNS BIGINT AS
$BODY$
DECLARE
    the_result BIGINT DEFAULT 0;
    sql_statement TEXT DEFAULT '';
    sql_statement_where TEXT DEFAULT '';
    sql_statement_pk TEXT DEFAULT '';
    col_name VARCHAR(64);
    col_name_cur CURSOR FOR
        SELECT
            column_name
        FROM
            information_schema.columns
        WHERE
            table_catalog = schema_name
            AND table_schema = 'public'
            AND table_name = table1
    ;
BEGIN
    OPEN col_name_cur;
    LOOP
        FETCH col_name_cur INTO col_name;
        IF NOT FOUND THEN
            EXIT;
        END IF;
        sql_statement_where := sql_statement_where || ' AND a.' || col_name || ' = b.' || col_name;
    END LOOP;
    SELECT
        column_name INTO col_name
    FROM
        information_schema.table_constraints tc
        JOIN information_schema.constraint_column_usage ccu ON
            ccu.constraint_name = tc.constraint_name
    WHERE
        tc.table_catalog = schema_name
        AND tc.table_schema = 'public'
        AND tc.table_name = table1
    LIMIT 1
    ;
    IF col_name IS NOT NULL THEN
        sql_statement_pk := ' a.' || col_name || ' IS NULL';
        sql_statement_pk := sql_statement_pk || ' OR b.' || col_name || ' IS NULL';
    END IF;
    sql_statement := 'SELECT COUNT(*) FROM ' || schema_name || '.public.' || table1 || ' a FULL OUTER JOIN ' || schema_name || '.public.' || table2 || ' b ON TRUE';
    sql_statement := sql_statement || sql_statement_where || ' WHERE ' || sql_statement_pk;
EXECUTE sql_statement INTO the_result;
    RETURN the_result;
END;$BODY$
LANGUAGE 'plpgsql' STABLE;
文章:来源
转载于酷壳CoolShell 无删改 仅以此纪念陈皓(左耳朵耗子)
目前在用群晖的 218+,感觉有点太老了,想更新一下 nas 。 然后看到最近很多朋友在看 nas ,就比较关注这类产品。 今天看到群晖宣布 DiskStation Plus …
你会用什么样的算法来为你的用户保存密码?如果你还在用明码的话,那么一旦你的网站被hack了,那么你所有的用户口令都会被泄露了,这意味着,你的系统或是网站就此完蛋了。所以,我们需…
我们有很多Coding Style 或 代码规范。但这一条可能会经常被我们所遗忘,就是我们经常会在函数的参数里使用bool参数,这会大大地降低代码的可读性。不信?我们先来看看下…
合速度